Scunthorpe problem


The Scunthorpe problem is the blocking of emails, forum posts or search results by a spam filter or search engine because their text contains a string of letters that also appears in an obscene word, or in a word considered improper or inadmissible for other reasons. While computers can easily identify strings of text within a document, broad blocking rules may result in false positives, causing innocent phrases to be blocked.
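
The difference between matching a string anywhere and matching it only as a complete word can be sketched in Python. This is a minimal illustration, not the code of any real filter: the blocklist and the function names are assumptions made for the example, and the test strings are taken from cases described later in this article.

```python
import re

# Hypothetical two-entry blocklist, for illustration only.
BLOCKLIST = ["sex", "cialis"]


def naive_is_blocked(text: str) -> bool:
    """Flag text if a blocked string occurs anywhere, even inside a longer word."""
    lowered = text.lower()
    return any(bad in lowered for bad in BLOCKLIST)


def whole_word_is_blocked(text: str) -> bool:
    """Flag text only when a blocked string appears as a complete word."""
    pattern = r"\b(?:" + "|".join(map(re.escape, BLOCKLIST)) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


for sample in ["specialist", "RomansInSussex.co.uk", "Buy Cialis now"]:
    print(sample, naive_is_blocked(sample), whole_word_is_blocked(sample))
```

Under these assumptions the naive check flags all three samples, while the whole-word check flags only the last one. Whole-word matching removes this class of false positive but does nothing for words that have both an innocent and an offensive meaning.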

Origin and history

The problem was named after an incident in 1996 in which AOL's profanity filter prevented residents of the town of Scunthorpe, Lincolnshire, England, from creating accounts with AOL, because the town's name contains the substring cunt.[1] Years later, Google's filters apparently made the same mistake, preventing residents from searching for local businesses that included Scunthorpe in their names.[2]

Other examples

Mistaken decisions by obscenity filters include:

Refused web domain names and email addresses

  • In April 1998, Jeff Gold attempted to register the domain name shitakemushrooms.com, but was blocked by an InterNIC filter prohibiting the "seven dirty words", which was active from 1996 until control was transferred to ICANN later in 1998.[3] (Shitake, more commonly spelled shiitake, is the Japanese name for the edible fungus Lentinula edodes.)
  • In 2000, a Canadian television news story on web filtering software found that the website for the now-defunct Montreal Urban Community (Communauté urbaine de Montréal, in French) was entirely blocked because its domain name was its French acronym CUM (www.cum.qc.ca).[4] In English, "cum" is a vulgar colloquial term for semen, ejaculation or achieving orgasm.[5]
  • In February 2004, in Scotland, Craig Cockburn reported that he was unable to use his surname (pronounced "Coburn") with Hotmail; Hotmail suggested that he spell his name C0ckburn (with a zero instead of the letter "o"), but later reversed the ban.[6] In 2010, he had a similar problem registering on the BBC website, where the first four characters of his surname again tripped the content filter.[7]
  • In February 2006, Linda Callahan, a resident of Ashfield, Massachusetts, was initially prevented from registering her name with Yahoo! as an e-mail address as it contained the substring allah. Yahoo! later reversed the ban.[8]
  • In July 2008, Dr. Herman I. Libshitz was initially unable to get the e-mail address he wanted from Verizon because it contained the substring shit. The company later apologized and lifted the ban.[9]
  • In July 2011, VocalEyes CEO Matthew Cock posted on Twitter regarding the suspension of his Google+ and Facebook profiles because of his surname.[10]

Blocked web searches

  • In the months leading up to January 1996, some web searches for Super Bowl XXX were filtered, because the Roman numeral used for the game and its site (XXX) is also used to identify pornography.[11]
  • The filter of the free wireless service of the town of Whakatane in New Zealand blocked searches involving the town's own name, because the phonetic analysis used by the filter deemed "whak" to sound like fuck.[12] The town name is Maori, and in the Maori language "wh" is most commonly pronounced as "f".
  • Gareth Roelofse noted in 2004, "We found many library Net stations, school networks and Internet cafes block sites with the word 'sex' in the domain name. This was a challenge for RomansInSussex.co.uk because its target audience is school children."[2]
  • In July 2011, web searches in China on the name Jiang were blocked following claims on the Sina Weibo microblogging site that former president Jiang Zemin had died. Since the word "jiang" meaning "river" is written with the same Chinese character, searches related to rivers including the Yangtze (Cháng Jiāng) produced the message "According to the relevant laws, regulations and policies, the results of this search cannot be displayed."[13]

Blocked emails

  • In 2001, Yahoo! Mail erroneously changed words in messages, for example writing medireview in place of medieval. This was caused by an email filter that automatically replaced JavaScript-related strings with alternative versions, to prevent cross-site scripting in HTML email. The filter hyphenated the terms "Javascript", "Jscript", "Vbscript" and "Livescript", and replaced "eval", "mocha" and "expression" with the similar but not quite synonymous terms "review", "espresso" and "statement", respectively. The filter was deliberately broad: no attempt was made to limit these string replacements to script sections and attributes, or to respect word boundaries, in case doing so left loopholes open.[14][15][16]
  • In October 2004, it was reported that the Horniman Museum in London was failing to receive some of its e-mail because filters mistakenly decided that its name was a version of the words horny man.[17]
  • Problems can occur with the words socialism, socialist, and specialist because they contain the substring Cialis, the brand name for an erectile dysfunction medication commonly advertised in spam e-mails. Blocking of the word specialist is liable to block emailed résumés, curricula vitae and other material including job descriptions.[18]
  • In February 2003, Members of Parliament in the British House of Commons found that a new spam filter was blocking e-mails sent to them. It blocked e-mails containing references to the Sexual Offences Bill then under debate, and some messages relating to a Liberal Democrat consultation paper on censorship.[19] It also blocked e-mails written in Welsh because it did not recognise the language.[20]
  • Residents of the English towns of Penistone,[21] Lightwater[21] and Clitheroe have all been repeatedly inconvenienced because their towns' names include substrings (penis, twat, clit) regarded as offensive by filtering software.[22]
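
The Yahoo! Mail substitutions described in the first item of this list can be reproduced with a short sketch. Only the three word pairs come from the account above; the function names and the structure of the code are assumptions made for illustration, not Yahoo!'s implementation, and the hyphenation of the script names is omitted.

```python
import re

# Word pairs reported for the 2001 Yahoo! Mail filter.
REPLACEMENTS = {"eval": "review", "mocha": "espresso", "expression": "statement"}


def replace_anywhere(text: str) -> str:
    """Replace each string wherever it occurs, ignoring word boundaries."""
    for old, new in REPLACEMENTS.items():
        text = text.replace(old, new)
    return text


def replace_whole_words(text: str) -> str:
    """Replace each string only where it stands alone as a word."""
    for old, new in REPLACEMENTS.items():
        text = re.sub(r"\b" + re.escape(old) + r"\b", new, text)
    return text


print(replace_anywhere("medieval history"))     # medireview history
print(replace_whole_words("medieval history"))  # medieval history
print(replace_whole_words("eval(x)"))           # review(x)
```

The whole-word variant leaves "medieval" intact while still rewriting a bare "eval", although, as noted above, the original filter avoided such restrictions for fear of leaving loopholes.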

Blocked for word with two meanings

  • In May 2006, Ray Kennedy from Manchester in the UK found that e-mails that he had written to his local council to complain about a planning application had been blocked as they contained the word erection when referring to a structure.[23]
  • In October 2004, e-mails advertising the pantomime Dick Whittington sent by a teacher from Norwich in the UK were blocked by school computers because of the use of the word Dick, which is also used as a vulgar slang term meaning penis.[24]
  • Résumés of magna cum laude graduates have been blocked by spam filters because they include the word cum, which here is Latin for with but is sometimes used in English as slang for semen.[25]
  • Blocked e-mails and web searches relating to the magazine The Beaver (based in Winnipeg) caused its publisher to rename it Canada's History after 89 years of publication.[26] Publisher Deborah Morrison commented: "Back in 1920, The Beaver was a perfectly appropriate name. And while its other meaning is nothing new, its ambiguity began to pose a whole new challenge with the advance of the Internet. The name became an impediment to our growth".[21]
  • A councillor in Dudley found an email flagged for profanity by his council's security software after mentioning the Black Country dish, faggots.[27]
  • In 2007, the Royal Society for the Protection of Birds blocked ornithological terms such as cock (male bird), tit, shag and booby from its discussion forums.[28]
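
Cases like these cannot be avoided by matching whole words only, because the blocked string is itself a complete, legitimate word. One possible mitigation, shown here purely as a sketch, is to remove known innocent phrases before checking. The word list, the phrase list and the function name are all hypothetical, and nothing in the incidents above indicates that any of the filters involved worked this way.

```python
import re

# Hypothetical lists, for illustration only.
BLOCKED_WORDS = {"cum", "dick", "erection"}
ALLOWED_PHRASES = ["magna cum laude", "dick whittington"]


def flagged_words(text: str, use_allowlist: bool = False) -> list[str]:
    """Return the blocked whole words found in text, in alphabetical order."""
    lowered = text.lower()
    if use_allowlist:
        # Blank out phrases known to be innocent before looking for blocked words.
        for phrase in ALLOWED_PHRASES:
            lowered = lowered.replace(phrase, " ")
    words = re.findall(r"[a-z]+", lowered)
    return sorted(set(words) & BLOCKED_WORDS)


print(flagged_words("Graduated magna cum laude"))                        # ['cum']
print(flagged_words("Graduated magna cum laude", use_allowlist=True))    # []
print(flagged_words("Objection to the erection of a fence", use_allowlist=True))  # ['erection']
```

The last call shows the limit of the approach: a phrase list only covers uses that someone anticipated, so an unlisted innocent use is still flagged.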

Modified content

Some websites use anti-obscenity filters that automatically replace words deemed offensive with words intended to be equivalent in meaning.[29]

  • In June 2008, a news site run by the American Family Association filtered an Associated Press article on sprinter Tyson Gay, replacing instances of "gay" with "homosexual", thus rendering his name as "Tyson Homosexual".[30] Another article from the same agency, published the same month, similarly altered the name of basketball player Rudy Gay, naming him "Rudy Homosexual".[31]
  • Several websites running rudimentary obscenity filters have replaced the word "ass" with "butt", resulting in "clbuttic" for "classic" and "buttbuttinate" for "assassinate".[29][32]
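
Both kinds of mangling described above can be reproduced in a few lines. The sketch below is an assumption about how such a filter might be written, not the code used by any of the sites mentioned; the function names and the capitalisation rule are illustrative.

```python
import re

# Substitution pairs taken from the two examples above.
SUBSTITUTIONS = {"ass": "butt", "gay": "homosexual"}


def substitute_anywhere(text: str) -> str:
    """Replace each string wherever it occurs, including inside longer words."""
    for old, new in SUBSTITUTIONS.items():
        text = text.replace(old, new)
    return text


def substitute_whole_words(text: str) -> str:
    """Replace only complete words, keeping an initial capital if present."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        new = SUBSTITUTIONS[word.lower()]
        return new.capitalize() if word[0].isupper() else new

    pattern = r"\b(?:" + "|".join(map(re.escape, SUBSTITUTIONS)) + r")\b"
    return re.sub(pattern, swap, text, flags=re.IGNORECASE)


print(substitute_anywhere("classic"))         # clbuttic
print(substitute_anywhere("assassinate"))     # buttbuttinate
print(substitute_whole_words("classic"))      # classic
print(substitute_whole_words("Tyson Gay"))    # Tyson Homosexual
```

Respecting word boundaries is enough to avoid "clbuttic", but it still rewrites the surname in "Tyson Gay", since a filter that sees only words cannot tell a name from a common noun.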

Other instances

Blocked pages

  • In January 2014, files used in the online game League of Legends were reported as being blocked by some UK ISP filters because the file names 'VarusExpirationTimer.luaobj' and 'XerathMageChainsExtended.luaobj' contain the letter sequence "sex" across a word boundary (in "VarusExpiration" and "ChainsExtended").[34]
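
The two file names show how a case-insensitive substring search can match across the capitalised words of a compound name. The sketch below contrasts that with a check that first splits the name into its capitalised parts. The one-word blocklist, the splitting rule and the function names are assumptions for illustration; the actual ISP filters' logic is not described in the report.

```python
import re

# Hypothetical one-word blocklist, for illustration only.
BLOCKLIST = {"sex"}


def substring_is_blocked(name: str) -> bool:
    """Case-insensitive search for a blocked string anywhere in the name."""
    lowered = name.lower()
    return any(bad in lowered for bad in BLOCKLIST)


def split_tokens(name: str) -> list[str]:
    """Split a camel-case name into capitalised or all-lowercase parts."""
    return re.findall(r"[A-Z][a-z]*|[a-z]+", name)


def token_is_blocked(name: str) -> bool:
    """Flag a name only if one of its parts is exactly a blocked word."""
    return any(token.lower() in BLOCKLIST for token in split_tokens(name))


for name in ["VarusExpirationTimer.luaobj", "XerathMageChainsExtended.luaobj"]:
    print(split_tokens(name), substring_is_blocked(name), token_is_blocked(name))
```

With these assumptions both names are flagged by the substring search and passed by the token check, because "sex" only appears when the end of one part is joined to the start of the next.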


