Entry: Scrapy
Definition

  1. History

  2. References

  3. External links

{{distinguish|Scrapie}}{{Infobox software
| name = Scrapy
| logo = File:Scrapy logo.jpg
| screenshot =
| caption =
| collapsible =
| author =
| developer = Scrapinghub, Ltd.
| released = {{Start date|2008|06|26|df=yes}}
| discontinued =
| latest release version = 1.6.0
| latest release date = {{Start date and age|2019|01|30|df=yes}}[1]
| latest preview version =
| latest preview date =
| programming language = Python
| operating system = Windows, macOS, Linux
| platform =
| size =
| language =
| genre = Web crawler
| license = BSD License
}}

Scrapy ({{IPAc-en|ˈ|s|k|r|eI|p|i}} {{respell|SKRAY|pee}})[2] is a free and open-source web-crawling framework written in Python. Originally designed for web scraping, it can also be used to extract data using APIs or as a general-purpose web crawler.[3] It is currently maintained by Scrapinghub Ltd., a web-scraping development and services company.

Scrapy's project architecture is built around "spiders", which are self-contained crawlers that are given a set of instructions. Following the spirit of other don't-repeat-yourself (DRY) frameworks, such as Django,[4] it makes it easier to build and scale large crawling projects by allowing developers to reuse their code. Scrapy also provides a web-crawling shell, which developers can use to test their assumptions about a site's behavior.[5]

Companies and products using Scrapy include Lyst,[6][7] Parse.ly,[8] Sayone Technologies,[9] Sciences Po Medialab,[10] and Data.gov.uk's World Government Data site.[11]

History

Scrapy originated at the London-based web-aggregation and e-commerce company Mydeco, where it was developed and maintained by employees of Mydeco and Insophia (a web-consulting company based in Montevideo, Uruguay). The first public release came in August 2008 under the BSD license, followed by the milestone 1.0 release in June 2015.[12] In 2011, Scrapinghub became the new official maintainer.[13][14]

References

1. ^{{Cite web|url=https://doc.scrapy.org/en/latest/news.html|title=Release notes — Scrapy documentation|website=doc.scrapy.org|language=en|access-date=2019-02-15}}
2. ^[https://groups.google.com/forum/#!topic/scrapy-users/tA_1T8du_WU How do you pronounce "Scrapy"?]
3. ^Scrapy at a glance.
4. ^{{cite web |url=http://doc.scrapy.org/en/latest/faq.html#did-scrapy-steal-x-from-django |title=Frequently Asked Questions |access-date=28 July 2015}}
5. ^{{cite web |url=http://doc.scrapy.org/en/latest/topics/shell.html |title=Scrapy shell |access-date=28 July 2015}}
6. ^{{cite web |url=http://talks.lystit.com/dsl-scraping-presentation/#/4 |title=Scalable Scraping Using Machine Learning |first1=Eddie |last1=Bell |first2=Jonathan |last2=Heusser |access-date=28 July 2015}}
7. ^Scrapy | Companies using Scrapy
8. ^{{cite web |url=https://speakerdeck.com/amontalenti/web-crawling-and-metadata-extraction-in-python |title=Web Crawling & Metadata Extraction in Python |first=Andrew |last=Montalenti}}
9. ^{{cite web |url=https://scrapy.org/companies/ |title=Scrapy Companies |website=Scrapy website}}
10. ^Hyphe v0.0.0: the first release of our new webcrawler is out!
11. ^{{Cite tweet |user=bfirsh |author=Ben Firshman |number=8025368963 |date = 21 January 2010 |title=World Govt Data site uses Django, Solr, Haystack, Scrapy and other exciting buzzwords http://bit.ly/5jU3La #opendata #datastore }}
12. ^{{cite mailing list |url=https://groups.google.com/forum/#!topic/scrapy-users/sMbBVIq0sko | title= Scrapy 1.0 official release out! |mailing-list=scrapy-users|last=Medina |first=Julia |date=19 June 2015}}
13. ^{{cite book |author=Pablo Hoffman |title=List of the primary authors & contributors |url=https://github.com/scrapy/scrapy/blob/master/AUTHORS |accessdate=18 November 2013 |year=2013}}
14. ^Interview Scraping Hub.

External links

  • {{Official website}}

Categories: Web crawlers | Web scraping | Free software programmed in Python | Software using the BSD license
