Slashdot Mirror


Python 3 Is Coming To Scrapy (scrapinghub.com)

New submitter Valdir Stumm Junior writes: Scrapy with beta Python 3 support is finally here! Released through Scrapy 1.1.0rc1, this is the result of several months of hard work on the part of the Scrapy community and Scrapinghub engineers.

This is a huge milestone for all you Scrapy users (and those who haven't used Scrapy due to the lack of Python 3). Scrapy veterans and new adopters will soon be able to move their entire stack to Python 3 once the release becomes stable. Keep in mind that since this is a release candidate, it is not ready to be used in production.

87 comments

  1. What the fuck is Scrapy? by Anonymous Coward · · Score: 5, Insightful

    What the fuck is Scrapy?

    1. Re:What the fuck is Scrapy? by vux984 · · Score: 3, Funny

      What the fuck is Scrapy?

      Pretty sure it's that irritating dog they added to scooby doo to try to inject new life into it but which only made it worse.

    2. Re:What the fuck is Scrapy? by pem · · Score: 2

      It's the sheep equivalent of mad cow disease.

    3. Re:What the fuck is Scrapy? by SeaFox · · Score: 1

      I thought they were talking about a Debian or Ubuntu release but the alphabetizing of version numbers is off.

    4. Re:What the fuck is Scrapy? by Anonymous Coward · · Score: 0

      Dude, keep up please. It's some shit or another.

    5. Re:What the fuck is Scrapy? by Darinbob · · Score: 2

      I had scrapy once but the ointment cleared it right up.

    6. Re:What the fuck is Scrapy? by Darinbob · · Score: 1

      That was Scrappy. Scrappy, Scrappy Poo, what are you?

    7. Re:What the fuck is Scrapy? by Anonymous Coward · · Score: 5, Informative

      Scrapy (/ˈskreɪpi/ SKRAY-pee) is a free and open source web crawling framework, written in Python. Originally designed for web scraping, it can also be used to extract data using APIs or as a general purpose web crawler.

    8. Re:What the fuck is Scrapy? by Applehu+Akbar · · Score: 1

      What the fuck is Scrapy?

      A rather ugly prion disease. It's like naming your new application Smegma or Pus.

    9. Re:What the fuck is Scrapy? by Anonymous Coward · · Score: 0

      What the fuck is Scrapy?

      A rather ugly prion disease. It's like naming your new application Smegma or Pus.

      Like the Samsung Galaxy Pus
      or the Kim Dotcom's Smegma Downloads?

    10. Re:What the fuck is Scrapy? by squiggleslash · · Score: 3, Funny

      It's a collaborative cloud-ready framework that leverages Python-based open source technologies to extract data across multiple standards-based web sites.

      (Wow, actually looking at their website, I got it right, and I made that up initially.)

      --
      You are not alone. This is not normal. None of this is normal.
    11. Re: What the fuck is Scrapy? by Anonymous Coward · · Score: 0

      I think I just won buzzword bingo with your post!

    12. Re:What the fuck is Scrapy? by Anonymous Coward · · Score: 0

      Please tell me of the synergies I can hope to achieve?

    13. Re:What the fuck is Scrapy? by invictusvoyd · · Score: 1

      What the fuck is Scrapy?

      You are a geek. You read the word scrapy along with the word python. The word is strikingly similar to the word scrape. You asked the above question.

      You are not a geek

    14. Re:What the fuck is Scrapy? by Anonymous Coward · · Score: 0

      Nephew of Scoby

    15. Re:What the fuck is Scrapy? by Anonymous Coward · · Score: 0

      "Scapy is a powerful interactive packet manipulation program. It is able to forge or decode packets of a wide number of protocols, send them on the wire, capture them, match requests and replies, and much more. It can easily handle most classical tasks like scanning, tracerouting, probing, unit tests, attacks or network discovery (it can replace hping, 85% of nmap, arpspoof, arp-sk, arping, tcpdump, tethereal, p0f, etc.). It also performs very well at a lot of other specific tasks that most other tools can't handle, like sending invalid frames, injecting your own 802.11 frames, combining technics..." But the story must be late because they already released version 2.3.2 but I am not seeing anything to indicate Python 3 support??? Oh Scrapy, wait what... Scrapy... wtf is that?

      Caption: mutiny

    16. Re:What the fuck is Scrapy? by Bitbeisser · · Score: 1

      Scrapy (/ˈskreɪpi/ SKRAY-pee) is a free and open source web crawling framework, written in Python. Originally designed for web scraping, it can also be used to extract data using APIs or as a general purpose web crawler.

      And their tagline is "Scapy is crappy"?

    17. Re:What the fuck is Scrapy? by Anonymous Coward · · Score: 0

      What the fuck is Scrapy?

      A disease sailors get on long voyages. Recently astronauts and cosmonauts have also suffered from Scapy. Terrible disease. Terrible outcomes. The poor voyager gets itchy all over from lack of free space.

      Scapy should not be confused with Scurvy or Scabies.

    18. Re:What the fuck is Scrapy? by WallyL · · Score: 1

      Mad cow disease is for cows! You are all cows, moo?

  2. WTF Is Scrapy? by Anonymous Coward · · Score: 5, Insightful

    Jesus Christ, I hate submissions like this. Not only is it a blatant product-pimping ad, they can't even be bothered to explain WTF they are pimping.

    1. Re:WTF Is Scrapy? by stummjr · · Score: 1

      Hey, I just added a comment explaining a little bit about Scrapy: http://developers.slashdot.org...

    2. Re:WTF Is Scrapy? by ChunderDownunder · · Score: 1

      it's New Editor Day, though I'm not sure yaelk is any improvement over timothy.

  3. Whoopie by Anonymous Coward · · Score: 0

    Maybe they should have been more careful with Python 3 so it isn't such a huge jump and stopped supporting the 2.x branch.

    The big Ruby BC break (1.8->1.9) wasn't a huge effort to port code over (none of my projects took more than 3-4 hours to fully port), and 1.8 has been dead for years. All of my Ruby projects have had no problems from 1.9.x to 2.3.0. Zero code breakage.

    Python will remain fractured until Guido pulls his head out of his ass, and given his really stupid design decisions over the years, that is unlikely to ever happen.

    1. Re:Whoopie by taxman_10m · · Score: 1

      I know none of this stuff in either Ruby or Python. Just doing a quick Google search, it seems the Ruby equivalent to Scrapy is Nokogiri. Anyone aware of the trials and tribulations Nokogiri users had with Ruby versions as compared to Python users and their Scrapy?

    2. Re:Whoopie by Anonymous Coward · · Score: 0

      Nope

      Nokogiri is strictly an HTML/XML parser.

      It can be used in a web scraper, but by itself it is not one. At best, Nokogiri is a subset of Scrapy.

      It works in both the 1.9 and 2.x branches

    3. Re:Whoopie by taxman_10m · · Score: 1

      is there a direct equivalent to Scrapy?

    4. Re:Whoopie by Anonymous Coward · · Score: 0

      mechanize

    5. Re:Whoopie by h33t+l4x0r · · Score: 1

      No, ruby mechanize is analogous to python mechanize or perl mechanize (but somewhat better). Scrapy is some crappy scraping framework that's analogous to any other crappy scraping framework. import.io comes to mind, as does kimono.

  4. This should help: by Anonymous Coward · · Score: 2
    1. Re:This should help: by Anonymous Coward · · Score: 2, Informative

      Nope, that link just goes to a page that says:

      let methat for you [sic]

      [Google Search] [I'm Feeling Lucky]

      Enable javascript to use LMGTFY.

    2. Re:This should help: by Darinbob · · Score: 2

      By your logic, all the Slashdot editors have to do in their summaries is say "there's some news, head to a news site to find out what it is!" Having to google what a summary is talking about means that it is a bad summary.

    3. Re:This should help: by Anonymous Coward · · Score: 0

      Yup, that's how it is. I just went to the front page and looked at all the summaries. All but two are "... writes:" with nothing else.

      I think slashdot could keep all the submissions for a timeframe in a pool visible with a "+1" style button, and every two hours the submission in that bunch with the highest score gets posted. Same thing but automated!

  5. Thanks, Obama. by shess · · Score: 0

    This is the most amazing thing I have ever heard! Scrapy is ... a thing, which ... does ... stuff, and now I can use things which do stuff with Python ... 3 is it? I can hardly contain my joy.

    1. Re:Thanks, Obama. by taxman_10m · · Score: 1

      I also fail to see the big deal.

    2. Re:Thanks, Obama. by JustAnotherOldGuy · · Score: 1

      This is the most amazing thing I have ever heard! Scrapy is ... a thing, which ... does ... stuff,

      I know, isn't it incredible? It's the most amazing thing I've never heard of. I can't wait for another "article" about some "thing" that does "stuff".

      Thank you, Slashdot editors, for maintaining the high journalistic standards we've come to never, ever expect!

      --
      Just cruising through this digital world at 33 1/3 rpm...
    3. Re:Thanks, Obama. by stummjr · · Score: 1

      Hey, I wrote a couple of words about what Scrapy is in a comment: http://developers.slashdot.org...

    4. Re:Thanks, Obama. by arth1 · · Score: 1

      Who is this "yaelk" submitter anyhow? He or she makes an overworked timothy look like an outstanding editor in comparison, and that's quite an achievement.

    5. Re:Thanks, Obama. by JustAnotherOldGuy · · Score: 1

      He or she makes an overworked timothy look like an outstanding editor in comparison,

      A monkey with a severe head injury would look like an outstanding editor in comparison with timothy.

      --
      Just cruising through this digital world at 33 1/3 rpm...
    6. Re:Thanks, Obama. by plopez · · Score: 1

      That's worth at least $10 million in crowd funding!

      --
      putting the 'B' in LGBTQ+
    7. Re:Thanks, Obama. by plopez · · Score: 1

      Can you condense that for me? Perhaps submit it to /.?

      --
      putting the 'B' in LGBTQ+
    8. Re:Thanks, Obama. by Anonymous Coward · · Score: 0

      A monkey with a severe head injury would look like an outstanding editor in comparison with timothy. by JustAnotherOldDOUCHEBLOWHARD (4145623) on Thursday February 04, 2016 @10:40PM (#51443875)

      Big pro editor JustAnotherOldDOUCHEBLOWHARD speaks his wisdom from years of professional experience as an editor for a highly successful publication yes? Not. He's being a bullshitting BLOWHARD fuckhead as usual with no brain, hahahaha! Why no brain? He was stupid enough to hire an epileptic surgeon operating with a chainsaw on his PUNY microcephalic skull as it to begin with - you see the results - do not trust his choices. The old douche is not a good judge of things seeing as he's deluded into thinking he has massive skills in writing and can judge others and his will is law (lol, not).

    9. Re: Thanks, Obama. by Anonymous Coward · · Score: 0

      Apk, is that you?

    10. Re: Thanks, Obama. by Anonymous Coward · · Score: 0

      Whoever it was that was funnier than shit. Epileptic surgeon operating with a chainsaw.

  6. This by OzPeter · · Score: 3

    This is the sort of thing I mark as Binspam in the Firehose. Blatant advertising.

    Unfortunately I was doing other things when it popped up.

    --
    I am Slashdot. Are you Slashdot as well?
    1. Re:This by Anonymous Coward · · Score: 0

      This is the sort of thing I mark as Binspam in the Firehose. Blatant advertising.

      Eh, I don't think I agree with you. Scrapy is a free software project.

      The story is on a site called practicalecommerce.com; if the story had been "come visit practicalecommerce.com! We have new stories! I can't believe #3!" then I would agree with you. But this is announcing that there is one fewer project that depends on Python 2.x, and I'm happy to hear it.

    2. Re:This by Anonymous Coward · · Score: 0

      This is news as much as a new major version of the Linux kernel is news. Just like all news it is very important to people that care and spam to people that don't understand.

    3. Re:This by OzPeter · · Score: 1

      Eh, I don't think I agree with you. Scrapy is a free software project.

      The story is on a site called practicalecommerce.com; if the story had been "come visit practicalecommerce.com! We have new stories! I can't believe #3!" then I would agree with you. But this is announcing that there is one fewer project that depends on Python 2.x, and I'm happy to hear it.

      Feel free to disagree with me, but I'll still Binspam it when I see it. Yes it is news, but it is boring as shit news that is only relevant to the few people who use the service. /. is all about the discussion yet there is nothing to discuss here. It is purely a product announcement for some minor system and nothing else. If it was some new revolutionary project that promised the world it would be a different thing. But it isn't. If it was some new announcement by a tech visionary then it would generate some discussion. But it hasn't. This is the sort of thing that Freshmeat was meant for.

      If you want to improve the quality of stories on /. then you have to cull crap like this from the main story feed.

      --
      I am Slashdot. Are you Slashdot as well?
    4. Re:This by stummjr · · Score: 1

      Just to make it clear, that link there was not there in the original submission. I actually never heard about it. :)

    5. Re:This by stummjr · · Score: 0

      Also, I added a comment talking a little bit about Scrapy: http://developers.slashdot.org...

  7. Finally!! by Anonymous Coward · · Score: 0

    Awesome news! I can't wait to finally use Scrapy with Python 3. Its lack of Python 3 support was very important to me. I am so enthusiastic I can finally use this fine piece of software with a version of Python that's been out roughly a decade. Python 3 support really puts the 'py' back in Scrapy.

  8. Scrapy is a web spider by Art3x · · Score: 3, Informative

    From the summary:

    Scrapy with beta Python 3 support is finally here!

    Here's how I would write it:

    Scrapy, a web spider, now has beta support for Python 3.

    This is why I get paid the big bucks.

    1. Re:Scrapy is a web spider by plopez · · Score: 3, Funny

      "Scrapy, a web spider, now has beta support for Python 3."

      TL;DR ;)

      --
      putting the 'B' in LGBTQ+
  9. What by ArchieBunker · · Score: 1

    What the fuck is Scrapy? How do you even pronounce it? Scray-pee or Scrap-e?

    --
    Only the State obtains its revenue by coercion. - Murray Rothbard
    1. Re:What by plopez · · Score: 4, Funny

      The "s" is silent.

      --
      putting the 'B' in LGBTQ+
    2. Re:What by Anonymous Coward · · Score: 0

      Actually, both the "s" and the "c" are silent.

  10. Seriously, WTF is "Scrapy"?? by JustAnotherOldGuy · · Score: 1, Troll

    Seriously, would it be SO hard to include a couple of words about what "Scrapy" is? Just a couple of words?

    Fuck it. You know, I had high hopes that the new owners of slashdot might exercise just the tiniest fucking bit of editorial skill or acumen, but I must be one of those perpetual optimists, doomed to disappointment.

    Pro Tip to the "editors": When you write an article about something, it's considered good journalistic practice to explain potentially obscure references so people know what the fuck you're talking about.

    --
    Just cruising through this digital world at 33 1/3 rpm...
    1. Re:Seriously, WTF is "Scrapy"?? by stummjr · · Score: 1

      hey, I just included a comment with a few words about what scrapy is: http://developers.slashdot.org...

    2. Re:Seriously, WTF is "Scrapy"?? by Anonymous Coward · · Score: 0

      The big pro editor JustAnotherOldDOUCHE speaks his wisdom from years of professional experience as an editor for a highly successful publication yes? Not. He's being a bullshitting BLOWHARD fuckhead as usual with no brain, hahahaha!

    3. Re:Seriously, WTF is "Scrapy"?? by Anonymous Coward · · Score: 0

      Even your explanation doesn't really help get to the point.

      So it's a framework. Great. A framework is a non-thing. What's inside the empty hull? ...For web crawlers/scrapers. What's the motivation behind crawling/scraping? What would be looked for? Why is a web crawler "stuff that matters?"

    4. Re:Seriously, WTF is "Scrapy"?? by Anonymous Coward · · Score: 0

      Really? You are that fucking retarded that you don't know what you could use a web scraper for?

      Data mining? How about as part of a security audit?

      Dumbass

    5. Re:Seriously, WTF is "Scrapy"?? by JustAnotherOldGuy · · Score: 1

      speaks his wisdom from years of professional experience as an editor

      You're absolutely right, Anonymous Coward...I've done technical writing and editing for 25+ years, and I'd never publish a content-free "article" like the ones so frequently found here. So yes, I do have "years of professional experience as an editor", decades, actually.

      I've worked for Boeing, Microsoft, AT&T, Sprint, Fluke, and quite a few other companies in the past, and, unlike you, I know whereof I speak. Also unlike you, I don't hide behind an anonymous account when I voice an opinion. :)

      Now go back to your basement hovel and freshen up for your exciting shift at Burger King.

      --
      Just cruising through this digital world at 33 1/3 rpm...
    6. Re:Seriously, WTF is "Scrapy"?? by Anonymous Coward · · Score: 0

      Prove it douchebag. Prove it. You can't can you? Nope. Anyone can talk a big game. Quit spouting your fantasyland fables asswipe.

    7. Re:Seriously, WTF is "Scrapy"?? by Anonymous Coward · · Score: 0

      Hahahaha, no you hide behind your fantasyland fake name online because you have never done a damn thing loser. Justanotheroldliar I've never seen an article by anyone with that name. Doesn't sound familiar as a respected author or tech writer.

    8. Re:Seriously, WTF is "Scrapy"?? by epyT-R · · Score: 1

      Anonymity doesn't make or break arguments, nor does relative financial success.

    9. Re:Seriously, WTF is "Scrapy"?? by Anonymous Coward · · Score: 0

      Elementary school children write. Anyone can be a "writer" you pitiful moron. Backup your words. Prove you worked at those places liar.

    10. Re:Seriously, WTF is "Scrapy"?? by JustAnotherOldGuy · · Score: 1

      Anonymity doesn't make or break arguments, nor does relative financial success.

      Sometimes it does.

      If you say "You'll never make a dime selling widgets because no one wants widgets", and you subsequently make a million dollars selling widgets, then the financial success does seem to indicate that people do indeed want widgets, rendering the "no one wants widgets" argument wrong.

      And if you'll notice, AC made no argument, he simply made a bunch of ad hominem attacks and unfounded assertions. I didn't actually see an argument in his rant, perhaps you could point it out.

      --
      Just cruising through this digital world at 33 1/3 rpm...
    11. Re:Seriously, WTF is "Scrapy"?? by Anonymous Coward · · Score: 0

      JustAnotherOldBLOWHARDDouche's delusional fake name online gives him legitimacy as the master of all things mystical and holy knowledge! He is the ultimate judge and authority (lol, in his deluded dim brain at least).

    12. Re:Seriously, WTF is "Scrapy"?? by Anonymous Coward · · Score: 0

      You claim to have worked for Microsoft. Prove it. Yet another lie from JustAnotherOldBLOWHARDDoucheLIAR obviously.

  11. Additional information by stummjr · · Score: 5, Informative

    Sorry guys for the not so informative story up there. Well, Scrapy is an open source framework to build web crawlers in Python. The lack of Python 3 support was a huge blocker for people who wanted to use Scrapy, but didn't want to use Python 2.7 anymore. This is why this release is a milestone for Scrapy users. It's very popular among Python developers working with web crawling/web scraping. You can check the project page at GitHub here: https://github.com/scrapy/scra...
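
For readers still asking what a crawling framework actually does, here is a minimal sketch of one core step: pulling the links out of a fetched page so they can be queued for the next request. It uses only the Python 3 standard library and is not Scrapy's API. The names `LinkExtractor` and `extract_links` and the example.com URLs are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from every <a href="..."> in a page (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page they were found on.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the links found in `html`, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Inline sample page, so the sketch runs without any network access.
SAMPLE_HTML = (
    '<p><a href="/page/2/">Next</a> '
    '<a href="https://example.org/about">About</a></p>'
)
LINKS = extract_links(SAMPLE_HTML, "https://example.com/page/1/")
```

A framework like Scrapy wraps this kind of step together with downloading pages, scheduling follow-up requests, and exporting the extracted data, so a crawler author mostly writes the extraction logic.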

    1. Re:Additional information by Anonymous Coward · · Score: 0

      Well the story should have mentioned what scrapy is and I happened to already know what scrapy is... But this is Slashdot so if it's not written in C, C++ or Perl they will just chew it to pieces with hate and modding down.

    2. Re:Additional information by h33t+l4x0r · · Score: 1

      It's very popular, but somehow took years to see Python 3 support? This is something that would take a single developer an afternoon to commit. I'm trying to decide if this speaks badly for the python community, or the scrapy community.

    3. Re:Additional information by h33t+l4x0r · · Score: 1

      Also, we all know that scrapers are a bunch of degenerate parasites (and yes I include myself in that group before the downvotes come in).

    4. Re:Additional information by BitZtream · · Score: 1

      And all 3 of you that use it or care already know about python3 support, and the entire rest of the Internet doesn't give a fuck.

      I know this because I have github projects that aren't working at all with more pull requests and forks than you have and I didn't do any spam advertising on slashdot for them.

      Slashdot is not a ad platform for your little pet project to reimplement (poorly) something that has been done properly at least 10 times already and done poorly like your hundreds of times.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    5. Re:Additional information by Heart44 · · Score: 1

      This is a good book about web scraping and it includes material about Scrapy:

      http://www.amazon.com/Web-Scraping-Python-Collecting-Modern/dp/1491910291/ref=sr_1_1?s=books&ie=UTF8&qid=1454730640&sr=1-1&refinements=p_27%3ARyan+Mitchell

  12. Scrappy Malkovitch by goombah99 · · Score: 0

    three problems scrappy solves:
    1. Scrappy Scrappy Scrappy? Scrappy, Scrappy Scrappy, Scrappy Scrappy.
    2. Scrappy Scrappy: Scrappy, Scrappy Scrappy!
    3. Scrappy, Scrappy Scrappy Scrappy. Scrappy Scrappy Scrappy, Scrappy Scrappy? Scrappy.
    4. ?????
    5. Profit!

      Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy. Scrappy Scrappy Scrappy Scrappy. Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy? Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy Scrappy!!!

    Or watch being john malkovitch.

    --
    Some drink at the fountain of knowledge. Others just gargle.
  13. Cool! by Irate+Engineer · · Score: 1

    WTF is Scrapy?

    --

    Left MS Windows for Linux Mint and never looked back!

    Vote for Bernie in 2016!

  14. Thank God by Anonymous Coward · · Score: 2, Funny

    The only thing keeping me from using Scrapy were two things:

    1) Lack of Python 3 support
    2) Knowing what Scrapy is


    Now only 2) is the thing that is keeping me from using it!

  15. UGH by Anonymous Coward · · Score: 0, Interesting

    Python, the language for hipster morons.

    1. Re:UGH by Anonymous Coward · · Score: 0

      Didn't all the hipsters move to node.js?

    2. Re:UGH by plopez · · Score: 1

      all the cool kids use Ruby

      https://www.youtube.com/watch?...

      --
      putting the 'B' in LGBTQ+
  16. WTF is Scrappy? by Anonymous Coward · · Score: 0

    Seems too niche to deserve front page coverage for supporting Python 3

  17. Dear New Slashdot Ownership: by Irate+Engineer · · Score: 1

    Something you may appreciate while reading this thread is that Slashdot users are very, very intolerant of advertising-motivated clickbait like this. We're here for, what was the phrase?..."News for Nerds"; thought-provoking science and technology discussions. We're not here to give our eyeballs all day to every website that paid you a few bucks to steer clicks to their new shiny product, though we realize bills need to be paid.

    If you're looking for ad revenue, you need to be smarter and less obnoxious about it with this crowd. We have ad-blockers and hosts files and we know how to use them, and we will use them until you figure out a respectful, non-intrusive means to deliver it while presenting content that draws in the nerds. You can work with the community here to develop a reasonable arrangement, or you can work against us and we'll just block everything, pack up, and leave.

    Dice went hard for clickbait at the expense of good content, and many left. Learn from their mistakes, please!

    --

    Left MS Windows for Linux Mint and never looked back!

    Vote for Bernie in 2016!

  18. Is it web scale? by goombah99 · · Score: 1

    Mongo DB is webscale.

    --
    Some drink at the fountain of knowledge. Others just gargle.
  19. We expect better under new management by Anonymous Coward · · Score: 1

    But this is Slashdot so if it's not written in C, C++ or Perl they will just chew it to pieces with hate and modding down.

    You make it sound like the criticism is not warranted. It *is* warranted, the blame being split squarely between:

    • The submitter for submitting a badly non-descriptive summary to the detriment of an excellent project.
    • The editors for not rejecting TFS (for being non-descriptive) or improving it by doing a few minutes of research and adding a few extra words. That's what editors are for, after all, and it's a pretty simple job.

    As TFS failed abysmally on both scores, the negative reception by readers was highly deserved.

    It was also predictable, given recent changes. We've had enough of this crap under DICE's mismanagement, and it's been turning Slashdot into a laughing stock. We don't want this incompetence to continue.

  20. Scrappy? by Anonymous Coward · · Score: 0

    Too bad I still don't have a clue what scrappy is or why I should care.

  21. Re:WTF is Scrapy? The gostak distims the doshes! by Anonymous Coward · · Score: 0