Eliminating bad traffic targeting Drupal and Solr
A short post-mortem on eliminating the bad search traffic that was triggering multiple reds throughout the day.
I put a handful of loose regex filters in place on the load balancer that redirect matching requests to a Google search. We were seeing hundreds of hits per minute going through Drupal straight to Solr, which put a heavy load on the entire web stack and eventually triggered failures. Because the URLs reused the same few search terms and were easy to parse, building rules to block them was straightforward. The bigger question was where to put the blockers. Ultimately, blocking these attacks at the load balancer was the best option, because the requests are turned away before they ever reach Drupal or Solr.
An additional concern was the possibility of snagging legit student searches in the filters. The rules address this by leaving at least some of the search filters usable. For example, a search for “commercial lease” still works, as do the drill downs by content type. If someone tries to drill down by topic after the initial search for “commercial lease”, the rule kicks in and the request is redirected to a Google search page for “CALI Lessons commercial lease”. Redirecting to a Google search gives any unsuspecting visitor doing a legit search a path back to the site.
Below are the lines added to the HAProxy configuration to enable the rules that block the bad actors from the site.
# One uniquely named ACL per search term. HAProxy ORs together ACLs that share
# a name, so distinct names keep each pattern tied to its own redirect.
acl match_commercial_lease url_reg /search/site/commercial%20lease\?f%5B0%5D=im_field_cali_topics.*$
http-request redirect location https://www.google.com/search?q=CALI+Lessons+commercial+lease if match_commercial_lease
acl match_bar_prep url_reg /search/site/bar%20prep\?f%5B0%5D=im_field_cali_topics.*$
http-request redirect location https://www.google.com/search?q=CALI+Lessons+bar+prep if match_bar_prep
acl match_later url_reg /search/site/later\?f%5B0%5D=im_field_cali_topics.*$
http-request redirect location https://www.google.com if match_later
acl match_commercial_please url_reg /search/site/commercial%20please\?f%5B0%5D=im_field_cali_topics.*$
http-request redirect location https://www.google.com/search?q=CALI+Lessons+commercial+lease if match_commercial_please
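
Before loading rules like these onto a production load balancer, it can help to check the patterns offline against a few sample URLs. The Python sketch below is one rough way to do that, not part of the actual setup. It pairs each pattern with its intended redirect target and assumes Python's re.search is a close enough stand-in for url_reg on these simple expressions. The helper name redirect_for, the sample topic IDs, and the content-type facet bundle%3Alesson are all hypothetical, made up for illustration.

```python
import re

# Patterns copied from the url_reg ACLs, each paired with its intended redirect.
# Assumption: these simple expressions behave the same in Python's re module
# as they do in HAProxy's regex engine.
RULES = [
    (r"/search/site/commercial%20lease\?f%5B0%5D=im_field_cali_topics.*$",
     "https://www.google.com/search?q=CALI+Lessons+commercial+lease"),
    (r"/search/site/bar%20prep\?f%5B0%5D=im_field_cali_topics.*$",
     "https://www.google.com/search?q=CALI+Lessons+bar+prep"),
    (r"/search/site/later\?f%5B0%5D=im_field_cali_topics.*$",
     "https://www.google.com"),
    (r"/search/site/commercial%20please\?f%5B0%5D=im_field_cali_topics.*$",
     "https://www.google.com/search?q=CALI+Lessons+commercial+lease"),
]


def redirect_for(url):
    """Return the redirect target of the first matching rule, or None."""
    for pattern, target in RULES:
        # re.search matches anywhere in the string, like a regex ACL would.
        if re.search(pattern, url):
            return target
    return None


if __name__ == "__main__":
    samples = [
        # Plain search: should pass through.
        "/search/site/commercial%20lease",
        # Hypothetical content-type drill down: should pass through.
        "/search/site/commercial%20lease?f%5B0%5D=bundle%3Alesson",
        # Topic drill down (made-up topic ID): should be redirected.
        "/search/site/commercial%20lease?f%5B0%5D=im_field_cali_topics%3A123",
        # Same drill down with literal brackets instead of %5B0%5D.
        "/search/site/commercial%20lease?f[0]=im_field_cali_topics%3A123",
    ]
    for url in samples:
        print(url, "->", redirect_for(url) or "allowed")
```

One thing this kind of check makes visible: the patterns only match the percent-encoded form f%5B0%5D. A request that sends the brackets literally as f[0] would slip past, so whether that matters depends on how the bad traffic actually encodes its URLs.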