Entry | Scrapy |
Definition |
Name: Scrapy
Developer: Scrapinghub, Ltd.
Initial release: 26 June 2008
Latest release: 1.6.0 (30 January 2019)[1]
Programming language: Python
Operating system: Windows, macOS, Linux
Genre: Web crawler
License: BSD License

Scrapy (/ˈskreɪpi/ SKRAY-pee)[2] is a free and open-source web-crawling framework written in Python. Originally designed for web scraping, it can also be used to extract data using APIs or as a general-purpose web crawler.[3] It is currently maintained by Scrapinghub Ltd., a web-scraping development and services company.

The Scrapy project architecture is built around "spiders", which are self-contained crawlers that are given a set of instructions. Following the spirit of other don't-repeat-yourself frameworks, such as Django,[4] it makes it easier to build and scale large crawling projects by allowing developers to reuse their code. Scrapy also provides a web-crawling shell, which developers can use to test their assumptions about a site's behavior.[5]

Some well-known companies and products using Scrapy are: Lyst,[6][7] Parse.ly,[8] Sayone Technologies,[9] Sciences Po Medialab,[10] and Data.gov.uk's World Government Data site.[11] [https://www.sayonetech.com/services/data-scraping/]

History

Scrapy was born at the London-based web-aggregation and e-commerce company Mydeco, where it was developed and maintained by employees of Mydeco and Insophia (a web-consulting company based in Montevideo, Uruguay). The first public release was in August 2008 under the BSD license, with a milestone 1.0 release following in June 2015.[12] In 2011, Scrapinghub became the new official maintainer.[13][14]

References

1. "Release notes — Scrapy documentation". doc.scrapy.org. https://doc.scrapy.org/en/latest/news.html. Retrieved 2019-02-15.
2. How do you pronounce "Scrapy"? https://groups.google.com/forum/#!topic/scrapy-users/tA_1T8du_WU
3. Scrapy at a glance.
4. "Frequently Asked Questions". http://doc.scrapy.org/en/latest/faq.html#did-scrapy-steal-x-from-django. Retrieved 28 July 2015.
5. "Scrapy shell". http://doc.scrapy.org/en/latest/topics/shell.html. Retrieved 28 July 2015.
6. Bell, Eddie; Heusser, Jonathan. "Scalable Scraping Using Machine Learning". http://talks.lystit.com/dsl-scraping-presentation/#/4. Retrieved 28 July 2015.
7. Scrapy | Companies using Scrapy.
8. Montalenti, Andrew. "Web Crawling & Metadata Extraction in Python". https://speakerdeck.com/amontalenti/web-crawling-and-metadata-extraction-in-python.
9. "Scrapy Companies". Scrapy website. https://scrapy.org/companies/.
10. Hyphe v0.0.0: the first release of our new webcrawler is out!
11. Firshman, Ben (@bfirsh) (21 January 2010). "World Govt Data site uses Django, Solr, Haystack, Scrapy and other exciting buzzwords http://bit.ly/5jU3La #opendata #datastore". Tweet no. 8025368963.
12. Medina, Julia (19 June 2015). "Scrapy 1.0 official release out!". scrapy-users mailing list. https://groups.google.com/forum/#!topic/scrapy-users/sMbBVIq0sko.
13. Hoffman, Pablo (2013). List of the primary authors & contributors. https://github.com/scrapy/scrapy/blob/master/AUTHORS. Retrieved 18 November 2013.
14. Interview Scraping Hub.
Categories: Web crawlers | Web scraping | Free software programmed in Python | Software using the BSD license