Entry: Scunthorpe problem
Definition

  1. Origin and history

  2. Other examples

      • Censored voice search results
      • Refused web domain names and account registrations
      • Blocked web searches
      • Blocked emails
      • Blocked for words with two meanings
      • News articles damaged
      • Other

  3. See also

  4. References

  5. External links

The Scunthorpe problem is the blocking of websites, e-mails, forum posts or search results by a spam filter or search engine because their text contains a string of letters that appears to have an obscene or otherwise unacceptable meaning. Names, abbreviations, and technical terms are the cases most often cited as being affected.

The problem arises because computers can easily identify strings of text within a document, but deciding whether such a string is actually offensive requires interpreting a wide range of contexts, possibly across many cultures, which is an extremely difficult task. As a result, broad blocking rules can produce false positives that affect innocent phrases.
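The mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the one-entry blocklist and both function names are hypothetical and do not reproduce any real filter.

```python
import re

# Hypothetical one-entry blocklist, for illustration only.
BLOCKLIST = ["sex"]

def naive_is_blocked(text: str) -> bool:
    """Flag the text if a blocklisted string occurs anywhere inside it."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKLIST)

def whole_word_is_blocked(text: str) -> bool:
    """Flag the text only if a blocklisted string occurs as a whole word."""
    pattern = r"\b(?:" + "|".join(re.escape(term) for term in BLOCKLIST) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

print(naive_is_blocked("Romans in Sussex"))       # True: a false positive
print(whole_word_is_blocked("Romans in Sussex"))  # False
```

Matching whole words removes this particular false positive, but it does not help with words that have two legitimate meanings, and it is easier to evade deliberately.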

Origin and history

The problem was named after an incident in 1996 in which AOL's profanity filter prevented residents of the town of Scunthorpe, North Lincolnshire, England from creating accounts with AOL, because the town's name contains the substring "cunt".[1]

Years later, Google's opt-in SafeSearch filters apparently made the same mistake, preventing residents from searching for local businesses that included Scunthorpe in their names.[2]

Other examples

Mistaken decisions by obscenity filters include:

Censored voice search results

  • In 2010, it was reported that terms such as 'lolita' (in the search term 'Nabokov lolita', a search for Vladimir Nabokov's novel Lolita), 'lolicon', 'incest', and 'whorehouse' (in the title of the musical The Best Little Whorehouse in Texas) were censored ("hashed out") by Google's voice search feature.[3][4] By contrast, 'bestiality' and Lady Chatterley's Lover were reportedly left uncensored.[5]

Refused web domain names and account registrations

  • In April 1998, Jeff Gold attempted to register the domain name shitakemushrooms.com, but he was blocked by an InterNIC filter prohibiting the "seven dirty words", which was active between 1996 and the transfer of control to ICANN in 1998.[6] (Shitake is a variant spelling of shiitake, the Japanese name for the edible fungus Lentinula edodes.)
  • In 2000, a Canadian television news story on web filtering software found that the website for the Montreal Urban Community (Communauté urbaine de Montréal, in French) was entirely blocked because its domain name was its French acronym CUM (www.cum.qc.ca);[7] "cum" (among other meanings) is English-language slang for semen.
  • In February 2004 in Scotland, Craig Cockburn reported that he was unable to use his surname (pronounced "Coburn") with Hotmail. Separately, he had problems with his workplace email because his job title, software specialist, contains the string cialis, the name of a pharmaceutical often used in the subject lines of spam and scam emails. Hotmail told him to spell his name C0ckburn (with a zero instead of the letter "o") and later reversed the ban.[8] In 2010 he had a similar problem registering on the BBC site, where the first four characters of his surname again tripped the content filter.[9]
  • In February 2006, Linda Callahan, a resident of Ashfield, Massachusetts, was initially prevented from registering her name with Yahoo! as an e-mail address because it contained the substring Allah. Yahoo! later reversed the ban.[10]
  • In July 2008, Dr. Herman I. Libshitz could not register an e-mail address containing his name with Verizon because his surname contained the substring shit, and Verizon initially rejected his request for an exception. In a subsequent statement, a Verizon spokeswoman apologized for not approving his desired e-mail address.[11]
  • In August 2018, Natalie Weiner reported on social media that she was unable to create an account for herself on a website because her last name is also a slang word for penis. Reportedly, "hundreds" of people replied to say that they had been affected in the same way. Those replying included Ben Schmuck (whose last name is a Yiddish word for "penis") and Arun Dikshit (whose last name is Sanskrit for one who teaches or provides knowledge, and contains the substring shit).[12][13][14] Articles covering the story stated that this was a common and extremely difficult technical problem for which no robust solution was currently available.[12]

Blocked web searches

  • In the months leading up to January 1996, some web searches for Super Bowl XXX were being filtered, because the Roman numeral for the game and the site (XXX) is also used to identify pornography.[15]
  • Gareth Roelofse, the web designer for RomansInSussex.co.uk, noted in 2004, "We found many library Net stations, school networks and Internet cafes block sites with the word 'sex' in the domain name. This was a challenge for RomansInSussex.co.uk because its target audience is school children."[2]
  • In 2008, the filter on the free wireless service of the town of Whakatane in New Zealand blocked searches involving the town's own name because the filter's phonetic analysis deemed "whak" to sound like fuck; the town name is Maori, and in the Maori language "wh" is most commonly pronounced as "f". The town subsequently put the name on the filter's whitelist.[16]
  • In July 2011, web searches in China on the name Jiang were blocked following claims on the Sina Weibo microblogging site that former president Jiang Zemin had died. Since the word "Jiang" meaning "river" is written with the same Chinese character (江), searches related to rivers including the Yangtze (Cháng Jiāng) produced the message "According to the relevant laws, regulations and policies, the results of this search cannot be displayed."[17]
  • In February 2018, web searches on Google's shopping platform were blocked for items such as glue guns, Guns N' Roses, and Burgundy after Google hastily patched its search system, which had been displaying results for weapons and accessories that violated Google's stated policies.[18]
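The whitelist remedy mentioned for Whakatane above can be sketched as an exception list that is applied before the blocklist. The following is a hypothetical illustration only: the lists, the function name, and the masking strategy are assumptions, not a description of how any of the filters above worked.

```python
# Hypothetical lists, for illustration only.
BLOCKLIST = ("gun",)
ALLOWLIST = ("burgundy", "glue gun", "guns n' roses")

def is_blocked(query: str, use_allowlist: bool = True) -> bool:
    """Substring blocklist check, optionally masking allowlisted phrases first."""
    text = query.lower()
    if use_allowlist:
        for phrase in ALLOWLIST:
            # Blank out known-innocent phrases so they cannot trigger a match.
            text = text.replace(phrase, " ")
    return any(term in text for term in BLOCKLIST)

for query in ("Burgundy wine", "glue guns", "Guns N' Roses T-shirt", "gun holster"):
    print(query, is_blocked(query, use_allowlist=False), is_blocked(query))
```

An exception list only covers the false positives someone has already noticed and reported, so it has to be maintained by hand as new cases appear.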

Blocked emails

  • In 2001, Yahoo! Mail introduced an email filter that automatically replaced JavaScript-related strings with alternative versions, to prevent the possibility of JavaScript viruses in HTML email. The filter hyphenated the terms "Javascript", "Jscript", "Vbscript" and "Livescript", and replaced "eval", "mocha" and "expression" with the similar but not quite synonymous terms "review", "espresso" and "statement", respectively. As a side effect it mangled ordinary words, for example turning medieval into medireview. The filter was written conservatively: no attempt was made to limit the string replacements to script sections and attributes, or to respect word boundaries, in case doing so left loopholes open.[19][20][21]
  • In February 2003, Members of Parliament at the British House of Commons found that a new spam filter was blocking e-mails to them. It blocked e-mails containing references to the Sexual Offences Bill then under debate, as well as some messages relating to a Liberal Democrat consultation paper on censorship.[22] It also blocked e-mails sent in Welsh because it did not recognise the language.[23]
  • In October 2004, it was reported that the Horniman Museum in London was failing to receive some of its e-mail because filters mistakenly treated its name as a version of the words horny man.[24]
  • Problems can occur with the words socialism, socialist, and specialist because they contain the substring Cialis, the brand name for an erectile dysfunction medication commonly advertised in spam e-mails. Blocking of the word specialist is liable to block emailed résumés and curricula vitae and other material including job descriptions.[25]
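The Yahoo! Mail behaviour in the list above can be reproduced with a short sketch. The replacement pairs come from the text; the function names and the whole-word variant are illustrative assumptions and are not Yahoo!'s actual code.

```python
import re

# Replacement pairs as reported in the text above.
REPLACEMENTS = {"eval": "review", "mocha": "espresso", "expression": "statement"}

def rewrite_anywhere(text: str) -> str:
    """Replace every occurrence, ignoring word boundaries (the reported behaviour)."""
    for old, new in REPLACEMENTS.items():
        text = text.replace(old, new)
    return text

def rewrite_whole_words(text: str) -> str:
    """Hypothetical variant that only replaces the terms when they stand alone."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(old) for old in REPLACEMENTS) + r")\b"
    )
    return pattern.sub(lambda match: REPLACEMENTS[match.group(1)], text)

print(rewrite_anywhere("medieval history"))     # medireview history
print(rewrite_whole_words("medieval history"))  # medieval history
```

The whole-word variant avoids the medireview effect, but, as the text notes, the filter's authors deliberately avoided such restrictions for fear of leaving loopholes open.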

Blocked for words with two meanings

  • In October 2004, e-mails advertising the pantomime Dick Whittington sent by a teacher from Norwich in the UK were blocked by school computers because of the use of the word Dick, sometimes used as slang for penis.[26]
  • In May 2006, Ray Kennedy from Manchester in the UK found that e-mails that he had written to his local council to complain about a planning application had been blocked as they contained the word erection when referring to a structure.[27]
  • In 2007, the Royal Society for the Protection of Birds blocked ornithological terms such as cock (a male bird), tit (a small songbird), and shag and booby (types of seabird) from its discussion forums.[28]
  • Blocked e-mails and web searches relating to The Beaver, a magazine based in Winnipeg, caused the publisher to change its name to Canada's History in 2010, after 89 years of publication.[29] Publisher Deborah Morrison commented: "Back in 1920, The Beaver was a perfectly appropriate name. And while its other meaning [vagina] is nothing new, its ambiguity began to pose a whole new challenge with the advance of the Internet. The name became an impediment to our growth".[30]
  • In 2011, a councillor in Dudley found an email flagged for profanity by his council's security software after mentioning the Black Country dish, faggots.[31]
  • Residents of Penistone in South Yorkshire have had e-mails blocked because the town's name includes the substring penis.[32]
  • Lightwater in Surrey suffered similarly because its name contains the substring twat.
  • Residents of Clitheroe (Lancashire, England) have been repeatedly inconvenienced because their town's name includes the substring clit, which is short for "clitoris".[33]
  • Resumes of magna cum laude graduates have been blocked by spam filters because they include the word cum, which here is Latin for with but is sometimes used as English slang for semen.[34]

News articles damaged

  • In June 2008, a news site run by the American Family Association filtered an Associated Press article on sprinter Tyson Gay, replacing instances of "gay" with "homosexual", thus rendering his name as "Tyson Homosexual".[35]
  • In December 2011, it was reported that software used by Virgin Media had filtered words including "Arsenal" (for "arse"), and "Canal" (for "anal").[36]
  • The word or string "ass" may be replaced by "butt", resulting in "clbuttic" for "classic" and "buttbuttinate" for "assassinate".[37]
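The "clbuttic" effect in the last item is what an unconditional substring replacement produces. A minimal sketch, with a hypothetical function name:

```python
def censor(text: str) -> str:
    """Replace the string "ass" wherever it occurs, even inside longer words."""
    return text.replace("ass", "butt")

print(censor("classic"))      # clbuttic
print(censor("assassinate"))  # buttbuttinate
```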

Other

  • In November 2013, Facebook temporarily blocked British users for using the word faggot in reference to the dish.[38]
  • In January 2014, files used in the online game League of Legends were reported as being blocked by some UK ISP filters because the names 'VarusExpirationTimer.luaobj' and 'XerathMageChainsExtended.luaobj' contain the substring "sex".[39]
  • In May 2018, the website of the grocery store Publix would not allow a customer to order a cake bearing the Latin phrase "Summa Cum Laude". The customer attempted to rectify the problem by including special instructions but still ended up with a cake reading "Summa --- Laude".[40][41]
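In the League of Legends file names above, the matched letters span two adjacent words of a camel-case identifier. The sketch below contrasts a case-insensitive substring check with a check on individual name parts. The function names and the tokenising regular expression are assumptions made for illustration; this is not how the ISP filters were implemented.

```python
import re

def substring_match(name: str, term: str = "sex") -> bool:
    """Case-insensitive substring check over the whole file name."""
    return term in name.lower()

def name_tokens(name: str) -> list[str]:
    """Split a file name into lower-cased parts at capitals, digits and punctuation."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", name)
    return [part.lower() for part in parts]

def token_match(name: str, term: str = "sex") -> bool:
    """Match only when the term is an entire part of the name."""
    return term in name_tokens(name)

for name in ("VarusExpirationTimer.luaobj", "XerathMageChainsExtended.luaobj"):
    print(name, substring_match(name), token_match(name))
```

Both names match as substrings ("VarusExpiration", "ChainsExtended") but not as tokens, since their parts are varus, expiration, timer and so on.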

See also

  • Censorship by Google
  • Cupertino effect
  • Spam detection
  • Wordfilter
  • False Positive

References

1. ^ Clive Feather (25 April 1996). Peter G. Neumann (ed.). "AOL censors British town's name!". The Risks Digest. Available at http://catless.ncl.ac.uk/Risks/18.07.html#subj3
2. ^ Declan McCullagh (23 April 2004). "Google's chastity belt too tight". CNET. Available at http://news.cnet.com/2100-1032_3-5198125.html (archived at https://web.archive.org/web/20110616041821/http://news.cnet.com/2100-1032_3-5198125.html on 16 June 2011; retrieved 2 September 2018)
3. ^ Chris Matyszczyk. "Google censors 'lolita' but not 'bestiality'". CNET News. Available at https://www.cnet.com/news/google-censors-lolita-but-not-bestiality/ (retrieved 1 September 2018)
4. ^ Jura. "Google censors lolicon sites". Anime Gerad. Available at http://animegerad.com/google-censors-lolicon-sites/ (archived at https://web.archive.org/web/20100422144110/http://animegerad.com/google-censors-lolicon-sites/ on 22 April 2010; retrieved 1 September 2018)
5. ^ Chris Matyszczyk. "Google censors 'lolita' but not 'bestiality'". CNET News. Available at https://www.cnet.com/news/google-censors-lolita-but-not-bestiality/ (retrieved 1 September 2018)
6. ^ Paul Festa (27 April 1998). "Food domain found 'obscene'". News.com. Available at https://www.cnet.com/news/food-domain-found-obscene/
7. ^ "Foire aux questions". radio-canada.ca. Available at http://www.radio-canada.ca/branche/v6/faq.html (archived at https://web.archive.org/web/20121021033232/http://www.radio-canada.ca/branche/v6/faq.html on 21 October 2012; retrieved 24 February 2011)
8. ^ Garry Barker (26 February 2004). "How Mr C0ckburn fought spam". Sydney Morning Herald. Available at http://www.smh.com.au/cgi-bin/common/popupPrintArticle.pl?path=/articles/2004/02/26/1077676867921.html (archived at https://web.archive.org/web/20090903173222/http://www.smh.com.au/cgi-bin/common/popupPrintArticle.pl?path=%2Farticles%2F2004%2F02%2F26%2F1077676867921.html on 3 September 2009; retrieved 24 February 2011)
9. ^ Craig Cockburn (9 March 2010). "BBC fail – my correct name is not permitted". blog.siliconglen.com. Available at http://blog.siliconglen.com/2010/03/bbc-fail-my-correct-name-is-not.html#/ (retrieved 24 February 2011)
10. ^ "Is Yahoo Banning Allah?". Kallahar's Place. Available at http://kallahar.com/stories/2005-Yahoo/yahoo.php (archived at https://web.archive.org/web/20160114064713/http://kallahar.com/stories/2005-Yahoo/yahoo.php on 14 January 2016; retrieved 24 February 2011)
11. ^ "When your name gets turned against you". Available at http://www.philly.com/philly/hp/news_update/26089374.html (archived at https://web.archive.org/web/20080805004344/http://www.philly.com/philly/hp/news_update/26089374.html on 5 August 2008; retrieved 3 August 2008)
12. ^ "The 'Scunthorpe Problem' Has Never Really Been Solved". Slashdot. Available at https://www.slashdot.org/story/345228
13. ^ https://twitter.com/natalieweiner/status/1034533245839450113
14. ^ https://twitter.com/ArunDickshit/status/1034676077451337729
15. ^ "E-Rate And Filtering: A Review Of The Children's Internet Protection Act". Congressional Hearings. General. Energy and Commerce, Subcommittee on Telecommunications and the Internet. 4 April 2001.
16. ^ "F-Word Town's Name Gets Censored By Internet Filter". Available at http://www.switched.com/2008/08/01/town-censors-its-name/ (archived at https://web.archive.org/web/20081201200338/http://www.switched.com/2008/08/01/town-censors-its-name/ on 1 December 2008; retrieved 27 July 2011)
17. ^ Josh Chin (6 July 2011). "Following Jiang Death Rumors, China’s Rivers Go Missing". The Wall Street Journal. Available at https://blogs.wsj.com/chinarealtime/2011/07/06/following-jiang-death-rumors-chinas-rivers-go-missing/ (retrieved 7 July 2011)
18. ^ "Wine lovers cannot buy Burgundy tipple on Google as internet giant cracks down on 'gun' searches". Available at https://www.telegraph.co.uk/news/2018/02/27/wine-lovers-cannot-buy-burgundy-tipple-google-internet-giant (archived at https://web.archive.org/web/20180302161934/https://www.telegraph.co.uk/news/2018/02/27/wine-lovers-cannot-buy-burgundy-tipple-google-internet-giant/ on 2 March 2018; retrieved 27 February 2018)
19. ^ "Yahoo admits mangling e-mail". BBC News. 19 July 2002. Available at http://news.bbc.co.uk/2/hi/science/nature/2138014.stm (retrieved 21 June 2013)
20. ^ "Hard news". Need To Know 2002-07-12. 12 July 2002. Available at http://www.ntk.net/2002/07/12/#HARD_NEWS (retrieved 21 June 2013)
21. ^ Will Knight (15 July 2002). "Email security filter spawns new words". New Scientist. Available at https://www.newscientist.com/article/dn2546-email-security-filter-spawns-new-words.html (retrieved 21 June 2013)
22. ^ "E-mail vetting blocks MPs' sex debate". BBC. 4 February 2003.
23. ^ "Software blocks MPs' Welsh e-mail". BBC. 5 February 2003.
24. ^ Adrian Kwintner (5 October 2004). "Name of museum is confused with porn". News Shopper. Available at http://www.newsshopper.co.uk/news/533121.Name_of_museum_is_confused_with_porn/ (retrieved 24 February 2011)
25. ^ "Comment headaches". The Peking Duck. 21 November 2004. Available at http://www.pekingduck.org/2004/11/comment-headaches/ (retrieved 24 February 2011)
26. ^ Sam Jones (14 October 2004). "Panto email falls foul of filth filter". The Guardian. Available at https://www.theguardian.com/uk/2004/oct/14/schools.primaryeducation
27. ^ "E-mail filter blocks 'erection'". BBC. 30 May 2006.
28. ^ "The word 'cock' is banned on RSPB's website". Daily Mail. 13 November 2012. Available at http://www.dailymail.co.uk/news/article-459183/The-word-cock-banned-RSPBs-website.html (retrieved 13 November 2012)
29. ^ "Canada's The Beaver magazine renamed to end porn mix-up". Agence France-Presse. 12 January 2010. Available at http://www.google.com/hostednews/afp/article/ALeqM5hJrLjBHj_y8G7OFPAZZmOhUL10Bw (archived at https://web.archive.org/web/20140305085824/http://www.google.com/hostednews/afp/article/ALeqM5hJrLjBHj_y8G7OFPAZZmOhUL10Bw on 5 March 2014; retrieved 12 January 2010)
30. ^ Jude Sheerin (29 March 2010). "How spam filters dictated Canadian magazine's fate". BBC News. Available at http://news.bbc.co.uk/2/hi/technology/8528672.stm (retrieved 29 March 2010)
31. ^ "Black Country Councillor Caught up in Faggots Farce". Birmingham Mail. 24 February 2011. Available at https://www.birminghammail.co.uk/news/local-news/black-country-councillor-caught-up-149100 (retrieved 24 February 2011)
32. ^ Tom Chatfield (17 April 2013). "The 10 best words the internet has given English". The Guardian. Available at https://www.theguardian.com/books/2013/apr/17/tom-chatfield-top-10-internet-neologisms (retrieved 4 February 2018)
33. ^ Ralph Keyes (2010). Unmentionables: From Family Jewels to Friendly Fire – What We Say Instead of What We Mean. John Murray. ISBN 978-1-84854-456-7. Available at https://books.google.com/books?id=3yj_IdO1zPEC&pg=PT15 (retrieved 21 June 2013)
34. ^ Kris Maher. "Don't Let Spam Filters Snatch Your Resume". Career Journal. Available at http://www.collegejournal.com/jobhunting/resumeadvice/20040426-maher.html (archived at https://web.archive.org/web/20061023111709/http://www.collegejournal.com/jobhunting/resumeadvice/20040426-maher.html on 23 October 2006; retrieved 11 February 2008)
35. ^ Mark Frauenfelder (30 June 2008). "Homophobic news site changes athlete Tyson Gay to Tyson Homosexual". BoingBoing. Available at http://boingboing.net/2008/06/30/homophobic-news-site.html (retrieved 22 December 2008)
36. ^ Hugo Gye (20 December 2011). "What the D***ens is going on? Over-zealous censors filter out favourite TV names (and don't even think of watching an Arsenal game". Daily Mail. Available at http://www.dailymail.co.uk/news/article-2076336/Virgin-Media-censors-offensive-D--ens-A--nal-Hitchc--zealous-blunder.html (retrieved 20 December 2011)
37. ^ Matthew Moore (2 September 2008). "The Clbuttic Mistake: When obscenity filters go wrong". The Daily Telegraph. London. Available at https://www.telegraph.co.uk/news/newstopics/howaboutthat/2667634/The-Clbuttic-Mistake-When-obscenity-filters-go-wrong.html (retrieved 4 April 2010)
38. ^ "Faggots and peas fall foul of Facebook censors". Express and Star. November 2013. Available at https://www.expressandstar.com/news/2013/11/01/faggots-and-peas-fall-foul-of-facebook-censors/
39. ^ Samuel Gibbs (21 January 2014). "UK porn filter blocks game update that contained 'sex'". The Guardian. London. Available at https://www.theguardian.com/technology/2014/jan/21/uk-porn-filter-blocks-game-update-that-contained-sex?CMP=twt_gu (retrieved 21 January 2014)
40. ^ Amber Ferguson (22 May 2018). "Proud mom orders ‘Summa Cum Laude’ cake online. Publix censors it: Summa … Laude.". Washington Post. ISSN 0190-8286. Available at https://www.washingtonpost.com/news/morning-mix/wp/2018/05/22/proud-mom-orders-summa-cum-laude-cake-online-publix-censors-it-to-summa-laude/ (retrieved 22 May 2018)
41. ^ Jenna Amatulli (22 May 2018). "Publix Censors Teen’s ‘Summa Cum Laude’ Graduation Cake". Huffington Post. Available at https://www.huffingtonpost.com/entry/publix-censors-teens-summa-cum-laude-graduation-cake_us_5b042efbe4b003dc7e46a54a

External links

  • Article on autocorrecting from The Guardian

Categories: AOL, Internet censorship, Profanity, Scunthorpe, Software bugs, Spam filtering

Copyright © 2023 OENC.NET All Rights Reserved
ICP filing no. 京ICP备2021023879号. Last updated: 2024/9/20 14:45:13