Fight Web to Lead Spam w/ Akismet

This functionality has since been improved, runs natively on Force.com, and has been released on the AppExchange. Learn more on our website.

Back in January, I posted about a way to help combat web to lead spam. That type of solution works well, but is not scalable. Also, it is a reactive approach rather than a proactive one.

I decided to try and see if I could incorporate Akismet into the web to lead process and I was successful in doing so! I created a set of scripts for you to download if you’d like to leverage Akismet with your web to lead forms.

Akismet is the best spam tool I have ever used. When I posted in January, it had caught 7,680 spam comments in its existence on this blog. Less than 4 months later, the spam count is up to 23,994. Needless to say, spam is an exponentially increasing problem and will plague your Salesforce.com environment eventually. I feel that Salesforce.com needs to include a spam filter in their product for web to lead (vote for it).

Until then, leveraging Akismet could help you significantly.

About the Solution

The scripts are intended to be proof of concept for you to use and apply to your own environment. The code and downloads are now being officially hosted at Arrowpointe’s Open Source Project at Google.

To use this solution, you’ll point your web to lead forms to this script rather than to the standard Salesforce Web to Lead page. The script accepts the data from your web to lead form, passes it to Akismet to determine whether it’s spam or not and then passes the data to Salesforce.com’s web to lead page with the Akismet result appended to it.
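
For readers on another platform, the flow described above can be sketched roughly as follows. This is a minimal Python illustration, not the PHP code in the download. The function names, the mapping of lead fields onto Akismet parameters, the two endpoint URLs, and the “1”/“0” checkbox values are all assumptions that should be checked against the current Akismet and Salesforce documentation.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed endpoints; verify both before relying on them.
AKISMET_CHECK_URL = "https://rest.akismet.com/1.1/comment-check"
W2L_URL = "https://webto.salesforce.com/servlet/servlet.WebToLead?encoding=UTF-8"


def build_akismet_request(form, api_key, site_url, user_ip, user_agent):
    """Map web to lead form fields onto Akismet comment-check parameters."""
    author = " ".join(
        part
        for part in (form.get("first_name", ""), form.get("last_name", ""))
        if part
    )
    return {
        "api_key": api_key,
        "blog": site_url,
        "user_ip": user_ip,
        "user_agent": user_agent,
        "comment_type": "contact-form",
        "comment_author": author,
        "comment_author_email": form.get("email", ""),
        "comment_content": form.get("description", ""),
    }


def build_w2l_request(form, is_spam, spam_field):
    """Copy the form and append the Akismet verdict in the checkbox field."""
    fields = dict(form)
    # Assumption: web to lead reads "1" as checked and "0" as unchecked.
    fields[spam_field] = "1" if is_spam else "0"
    return fields


def post_form(url, fields, timeout=10):
    """POST url-encoded fields and return the response body as text."""
    body = urlencode(fields).encode("utf-8")
    with urlopen(Request(url, data=body), timeout=timeout) as response:
        return response.read().decode("utf-8")


def relay_lead(form, api_key, site_url, user_ip, user_agent, spam_field):
    """Ask Akismet for a verdict, then forward the lead to Salesforce."""
    verdict = post_form(
        AKISMET_CHECK_URL,
        build_akismet_request(form, api_key, site_url, user_ip, user_agent),
    )
    # Akismet answers with the literal text "true" when it considers the content spam.
    is_spam = verdict.strip() == "true"
    return post_form(W2L_URL, build_w2l_request(form, is_spam, spam_field))
```

In this sketch, relay_lead would be called once per form submission, with user_ip and user_agent taken from the incoming request on the server side. The lead is forwarded either way; only the checkbox value changes, which leaves the decision about what to do with spam to your rules in Salesforce.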

In Salesforce, you’ll need to add a checkbox field (e.g. “Akismet marked as spam”) on your Lead. If Akismet thinks it’s spam, that field will be set to TRUE. You would then need to add assignment rules or validation rules to do whatever you need. For example, you could have a lead assignment rule looking at that one checkbox field and put leads into a special queue if they are marked as spam.

The script leverages the Akismet PHP5 Class to handle the core communication with Akismet. I found this class on the Akismet Development page.

This script will only work on PHP5 and requires the cURL module to be enabled. cURL is enabled by default in most PHP installations. The PHP5 requirement is a limitation of the Akismet PHP5 Class. If you are on another platform (PHP4, Ruby, Java, etc.), I don’t see any reason why you couldn’t use these scripts and integrate a different Akismet toolkit into it. Additional toolkits are available from the Akismet Developer Page.

Cost

You’ll need a server running PHP5. If you have a server already set up, then hardware should cost you nothing.

Akismet is free for personal use, but has a small license fee for commercial use. If you will be using this in production, you should purchase an Akismet Commercial License Key. During development, I am sure you could use a personal key to make sure it works.

The scripts themselves are free and licensed under the GNU General Public License v3. I did this as a proof of concept for the betterment of Salesforce.com data quality everywhere.

Getting Started
  1. Add a checkbox field to your Salesforce.com Lead Object that will hold whether Akismet thinks it’s spam or not.
     [Screenshot: akismet_marked_as_spam.png]
  2. Download the scripts. Three files will exist in the zip file:
    • index.php: The main script that handles the incoming data, talks to Akismet and posts the data to Salesforce.com Web to Lead.
    • constants.php: You will need to go into this file and edit some variable values based upon your own organizational setup. See the next step.
    • Akismet.class.php: This is the Akismet PHP5 class I was talking about above.
  3. Edit the constants.php file:
    • Enter your WordPress API key where it says ENTER_WORDPRESS_API_KEY. Get a personal or commercial key if you don’t have one.
    • Enter your company URL where it says ENTER_YOUR_COMPANY_URL.
    • Enter your company’s Salesforce.com Org ID where it says ENTER_YOUR_SALESFORCE_ORG_ID. This is actually optional. Doing this allows you to remove the OID from your public web to lead forms so spammers don’t know your Org ID, which will reduce the amount of spam you actually need to process.
    • Generate a web to lead form in your Salesforce setup. Find the new Salesforce field you created in step 1, copy its id value from the HTML form, and put it in the constants.php file where it says ENTER_THE_W2L_CUSTOMFIELD_NAME. This step is required so that your Salesforce org is actually populated with the Akismet result.
    • PHP Advanced: The $Akismet_noPass array holds the names of fields that should not be included in the content passed to Akismet. Feel free to add/remove values from this array. The values in the array refer to the names of form fields in your web to lead HTML form. I have no idea if this helps/hurts, but it seemed like a practical thing to add into the script.
  4. Upload the scripts to your web server and note the fully qualified URL for that directory (e.g. http://www.example.com/web2lead)
  5. Update your web to lead forms to have them post to the location of the files from the previous step (e.g. http://www.example.com/web2lead/). Make sure to put the / at the end of the URL. Not sure why, but I wasn’t able to get it to work without it.
  6. Test your form to see if it works. The script acts as an intermediary between your form and Salesforce web to lead. The end-user experience should be exactly the same with or without the scripts.
  7. Once you know it works, you should add a lead assignment rule into Salesforce as rule #1 that looks to see if this field is checked. If so, then route the lead to a “Potential Spam” queue or something of that nature. Another option is to create a validation rule that doesn’t even allow the lead into the system.
  8. Make sure your Auto Response rules don’t email a reply to leads marked as spam. If you allow this, then those spammers have an email address to try.
  9. Update some/all of your web to lead forms to post to this new page. If desired, remove the “oid” field from the HTML form for each of these since the script will pass your Org ID to Salesforce automatically.
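
To make the constants.php settings more concrete, here is a rough Python sketch of what they control. It is an illustration only and not the contents of constants.php: the placeholder strings come from the steps above, but the dictionary keys, the NO_PASS defaults, the “name: value” content format, and both helper functions are hypothetical.

```python
# Placeholder values mirror the ones named in the constants.php step above.
SETTINGS = {
    "wordpress_api_key": "ENTER_WORDPRESS_API_KEY",
    "company_url": "ENTER_YOUR_COMPANY_URL",
    "salesforce_org_id": "ENTER_YOUR_SALESFORCE_ORG_ID",  # optional
    "w2l_custom_field": "ENTER_THE_W2L_CUSTOMFIELD_NAME",
}

# Hypothetical defaults: form fields that carry no visitor-written content.
NO_PASS = {"oid", "retURL", "submit"}


def content_for_akismet(form, no_pass=NO_PASS):
    """Join the form fields into one text block, skipping excluded or empty ones."""
    return "\n".join(
        f"{name}: {value}"
        for name, value in form.items()
        if name not in no_pass and value
    )


def with_org_id(form, org_id):
    """Add the Org ID on the server when the public form leaves it out."""
    fields = dict(form)
    if org_id and "oid" not in fields:
        fields["oid"] = org_id
    return fields
```

With something like with_org_id in place, the public HTML form never has to expose the Org ID. The exclusion list plays the same role as the $Akismet_noPass array: it keeps routing fields out of the text that Akismet evaluates.
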
Other Stuff

These scripts are a proof of concept. I am not officially supporting them, but am happy to help people out informally. Post comments here if you have questions/comments/criticisms.

I have only tested this with leads that were either real or very obviously spam. It worked well. From Akismet’s perspective, the data it checks looks just like a blog comment and I can attest that Akismet is amazing at identifying blog comment spam. So it should work well for Salesforce.com web to lead.

I am going to update my existing web to lead forms on this site and see how it works and report back to you. I encourage you to do the same and let me know your experiences with it or recommendations on how to improve the scripts.

Enjoy!

61 Comments

  1. Scott Hemmeter Said,

    October 9, 2009 @ 10:58 am

    @Darren,

    I am not sure. What I’d do is create a simple page to catch the post and review what’s contained in the $_POST data for that field.

    You might also want to check out the Spam Check app on AppExchange. It’s been rewritten in Visualforce so it all runs on the platform.

  2. Flashman Said,

    January 30, 2010 @ 10:43 am

    I’m confused, is this for a WordPress form or any form? The first bullet in #3 says to enter my WordPress key. What if I just have an HTML page?

  3. Scott Hemmeter Said,

    January 31, 2010 @ 9:38 pm

    @Flashman, it is not a WordPress form. It does use Akismet, which is a spam service run by the folks at WordPress (Automattic actually), so they use the WordPress key for authentication.

  4. Mike C Said,

    March 2, 2010 @ 10:00 am

    Scott, thanks for publishing this code. I can’t believe SFDC doesn’t have a good answer for web-to-lead spam.

    The problem I’m having seems to be with the custom field. I set up “Marked as Spam” to be checked by default, assuming the scripts would deselect the checkbox for all valid forms I pass through it. This is working great for my test form. Spammy submissions mark the field checked, non-spammy submissions deselect the checkbox. The issue is that the spam isn’t passing through the scripts — it’s going to SFDC directly. Although I set up the custom field to be checked by default, spam leads entered through the API seem to ignore this field.

    Is there a way to make a custom field required by default? Any other ideas?

    Many thanks for your effort to help the SFDC community!

  5. Scott Hemmeter Said,

    March 2, 2010 @ 10:29 am

    @Mike C, there is a good native solution for web to lead spam (here)

    The way the PHP solution works is that it works as an intermediary between your website and Salesforce’s web to lead process. It takes in the data, checks if it’s spam and then sends the data on to Salesforce as if you had that checkbox set in the first place. Any other activity in your system does not run through the process. It is then up to you to handle what you want to do with the spam. Personally, I route it to a “Spam” queue, check it out and delete it if it’s real spam.

    The way my updated, native force.com solution works is similar. You have a Sites page that acts as an intermediary. The same concept applies about you routing it where you need it to go. The additional piece of the new app (called Spam Check) is that it also includes Apex methods that allow you to do a spam evaluation in your own code. Thus, you could add a trigger to any object that checks if it’s spam and do what you want to with the results.

    If I didn’t answer your question well here and it requires a more detailed conversation, submit your contact info here and we can talk.

  6. John Wood Said,

    March 12, 2010 @ 2:54 pm

    Great set of scripts, thanks!

  7. Andrew Said,

    March 7, 2012 @ 11:47 pm

    Scott, great script! works like a charm!

    2 quick questions:

    1. is it possible to not post to chatter based on if it is spam or not?
    2. is there a way to pass through the IP it was submitted from? This way we can blacklist their IP for repeat offenders/bots.

    Thanks!

  8. Scott Hemmeter Said,

    March 8, 2012 @ 9:46 am

    @Andrew,

    Regarding #1, I assume you are referring to creating a feedpost that the record was created. This is not done by the app, it’s done by Salesforce in general. The app doesn’t know anything about Chatter.

    #2 – you’d have to capture the IP Address on your own in a hidden form field just like you’d have to do now. The system will utilize the IP address when asking the Akismet service whether it’s spam or not, but doesn’t specifically pass it into your org.

  9. Andrew Said,

    March 9, 2012 @ 12:53 am

    Scott,

    Thanks for the reply,

    I’ve implemented a hidden IP Address field, and have gotten it to pass it along and display in salesforce (like what you did above for akismet). It works when I test it and displays my IP, but when the spam comes in the field is blank! Any ideas?

    I really need to block this, i know it is all coming from the same person/bot etc. because the company fields and names are exactly the same…the form has been live for 24 hours and I’ve gotten over 500 spam submissions!!

  10. Scott Hemmeter Said,

    March 9, 2012 @ 9:22 am

    @Andrew, the only thing I can think of is that the spammers are not using your live form, but instead are using your old form or they just know your org Id and are pumping data into the org. Or they are not using a browser and are just scripting the communications. If you are capturing the IP address with JavaScript on the page, it’ll never get it. With knowledge of an org Id, anyone could post leads into a salesforce org.

    Try getting the IP address in PHP instead. The server side should know the IP no matter how it’s being communicated with.

    So spam checking or not, it seems you need something inside Salesforce. Ideas…
    – Create a Validation Rule to block the current spam from coming in at all.
    – Create a Spam queue and have workflow or assignment rules auto-assign the spam to that queue. Periodically delete the data in that queue. At least it’s out of the way.
    – Adjust your Auto Response Rules so you are not emailing the spammers back.

  11. Alanna Jackson Said,

    April 16, 2012 @ 1:09 pm

    One of our clients, VanillaSoft, was experiencing quite a bit of web to lead spam. VanillaSoft is a lead management software (www.vanillasoft.com?pmc=jms) and they too had to implement a verification checking system to verify that it is human vs. bot. Even with the system in place, they still get quite a few spam sign-ups. Are bots that good and have that many companies deployed humans to get past the filters?
