Slashdot Mirror


Python 3 Is Coming To Scrapy (scrapinghub.com)

New submitter Valdir Stumm Junior writes: Scrapy with beta Python 3 support is finally here! Released through Scrapy 1.1.0rc1, this is the result of several months of hard work on the part of the Scrapy community and Scrapinghub engineers.

This is a huge milestone for all you Scrapy users (and those who haven't used Scrapy due to the lack of Python 3 support). Scrapy veterans and new adopters will soon be able to move their entire stack to Python 3 once the release becomes stable. Keep in mind that since this is a release candidate, it is not ready to be used in production.

51 of 87 comments

  1. What the fuck is Scrapy? by Anonymous Coward · · Score: 5, Insightful

    What the fuck is Scrapy?

    1. Re:What the fuck is Scrapy? by vux984 · · Score: 3, Funny

      What the fuck is Scrapy?

      Pretty sure it's that irritating dog they added to scooby doo to try to inject new life into it but which only made it worse.

    2. Re:What the fuck is Scrapy? by pem · · Score: 2

      It's the sheep equivalent of mad cow disease.

    3. Re:What the fuck is Scrapy? by SeaFox · · Score: 1

      I thought they were talking about a Debian or Ubuntu release but the alphabetizing of version numbers is off.

    4. Re:What the fuck is Scrapy? by Darinbob · · Score: 2

      I had scrapy once but the ointment cleared it right up.

    5. Re:What the fuck is Scrapy? by Darinbob · · Score: 1

      That was Scrappy. Scrappy, Scrappy Poo, what are you?

    6. Re:What the fuck is Scrapy? by Anonymous Coward · · Score: 5, Informative

      Scrapy (/ˈskreɪpi/ SKRAY-pee) is a free and open source web crawling framework, written in Python. Originally designed for web scraping, it can also be used to extract data using APIs or as a general purpose web crawler.

    7. Re:What the fuck is Scrapy? by Applehu+Akbar · · Score: 1

      What the fuck is Scrapy?

      A rather ugly prion disease. It's like naming your new application Smegma or Pus.

    8. Re:What the fuck is Scrapy? by squiggleslash · · Score: 3, Funny

      It's a collaborative cloud-ready framework that leverages Python-based open source technologies to extract data across multiple standards-based web sites.

      (Wow, actually looking at their website, I got it right, and I made that up initially.)

      --
      You are not alone. This is not normal. None of this is normal.
    9. Re:What the fuck is Scrapy? by invictusvoyd · · Score: 1

      What the fuck is Scrapy?

      You are a geek. You read the word scrapy along with the word python. The word is strikingly similar to the word scrape. You asked the above question.

      You are not a geek

    10. Re:What the fuck is Scrapy? by Bitbeisser · · Score: 1

      Scrapy (/ˈskreɪpi/ SKRAY-pee) is a free and open source web crawling framework, written in Python. Originally designed for web scraping, it can also be used to extract data using APIs or as a general purpose web crawler.

      And their tagline is "Scapy is crappy"?

    11. Re:What the fuck is Scrapy? by WallyL · · Score: 1

      Mad cow disease is for cows! You are all cows, moo?

  2. WTF Is Scrapy? by Anonymous Coward · · Score: 5, Insightful

    Jesus Christ, I hate submissions like this. Not only is it a blatant product-pimping ad, they can't even be bothered to explain WTF they are pimping.

    1. Re:WTF Is Scrapy? by stummjr · · Score: 1

      Hey, I just added a comment explaining a little bit about Scrapy: http://developers.slashdot.org...

    2. Re:WTF Is Scrapy? by ChunderDownunder · · Score: 1

      it's New Editor Day, though I'm not sure yaelk is any improvement over timothy.

  3. This should help: by Anonymous Coward · · Score: 2
    1. Re:This should help: by Anonymous Coward · · Score: 2, Informative

      Nope, that link just goes to a page that says:

      let methat for you [sic]

      [Google Search] [I'm Feeling Lucky]

      Enable javascript to use LMGTFY.

    2. Re:This should help: by Darinbob · · Score: 2

      By your logic, all the Slashdot editors have to do in their summaries is say "there's some news, head to a news site to find out what it is!" Having to google what a summary is talking about means that it is a bad summary.

  4. Re:Whoopie by taxman_10m · · Score: 1

    I know none of this stuff in either Ruby or Python. Just doing a quick Google search, it seems the Ruby equivalent to Scrapy is Nokogiri. Anyone aware of the trials and tribulations Nokogiri users had with Ruby versions as compared to Python users and their Scrapy?

  5. Re:Thanks, Obama. by taxman_10m · · Score: 1

    I also fail to see the big deal.

  6. This by OzPeter · · Score: 3

    This is the sort of thing I mark as Binspam in the Firehose. Blatant advertising.

    Unfortunately I was doing other things when it popped up.

    --
    I am Slashdot. Are you Slashdot as well?
    1. Re:This by OzPeter · · Score: 1

      Eh, I don't think I agree with you. Scrapy is a free software project.

      The story is on a site called practicalecommerce.com; if the story had been "come visit practicalecommerce.com! We have new stories! I can't believe #3!" then I would agree with you. But this is announcing that there is one fewer project that depends on Python 2.x, and I'm happy to hear it.

      Feel free to disagree with me, but I'll still Binspam it when I see it. Yes, it is news, but it is boring-as-shit news that is only relevant to the few people who use the service. /. is all about the discussion, yet there is nothing to discuss here. It is purely a product announcement for some minor system and nothing else. If it was some new revolutionary project that promised the world it would be a different thing. But it isn't. If it was some new announcement by a tech visionary then it would generate some discussion. But it hasn't. This is the sort of thing that Freshmeat was meant for.

      If you want to improve the quality of stories on /. then you have to cull crap like this from the main story feed.

      --
      I am Slashdot. Are you Slashdot as well?
    2. Re:This by stummjr · · Score: 1

      Just to make it clear, that link there was not there in the original submission. I actually never heard about it. :)

  7. Scrapy is a web spider by Art3x · · Score: 3, Informative

    From the summary:

    Scrapy with beta Python 3 support is finally here!

    Here's how I would write it:

    Scrapy, a web spider, now has beta support for Python 3.

    This is why I get paid the big bucks.

    1. Re:Scrapy is a web spider by plopez · · Score: 3, Funny

      "Scrapy, a web spider, now has beta support for Python 3."

      TL;DR ;)

      --
      putting the 'B' in LGBTQ+
  8. What by ArchieBunker · · Score: 1

    What the fuck is Scrapy? How do you even pronounce it? Scray-pee or Scrap-e?

    --
    Only the State obtains its revenue by coercion. - Murray Rothbard
    1. Re:What by plopez · · Score: 4, Funny

      The "s" is silent.

      --
      putting the 'B' in LGBTQ+
  9. Seriously, WTF is "Scrapy"?? by JustAnotherOldGuy · · Score: 1, Troll

    Seriously, would it be SO hard to include a couple of words about what "Scrapy" is? Just a couple of words?

    Fuck it. You know, I had high hopes that the new owners of slashdot might exercise just the tiniest fucking bit of editorial skill or acumen, but I must be one of those perpetual optimists, doomed to disappointment.

    Pro Tip to the "editors": When you write an article about something, it's considered good journalistic practice to explain potentially obscure references so people know what the fuck you're talking about.

    --
    Just cruising through this digital world at 33 1/3 rpm...
    1. Re:Seriously, WTF is "Scrapy"?? by stummjr · · Score: 1

      hey, I just included a comment with a few words about what scrapy is: http://developers.slashdot.org...

    2. Re:Seriously, WTF is "Scrapy"?? by JustAnotherOldGuy · · Score: 1

      speaks his wisdom from years of professional experience as an editor

      You're absolutely right, Anonymous Coward...I've done technical writing and editing for 25+ years, and I'd never publish a content-free "article" like the ones so frequently found here. So yes, I do have "years of professional experience as an editor", decades, actually.

      I've worked for Boeing, Microsoft, AT&T, Sprint, Fluke, and quite a few other companies in the past, and, unlike you, I know whereof I speak. Also unlike you, I don't hide behind an anonymous account when I voice an opinion. :)

      Now go back to your basement hovel and freshen up for your exciting shift at Burger King.

      --
      Just cruising through this digital world at 33 1/3 rpm...
    3. Re:Seriously, WTF is "Scrapy"?? by epyT-R · · Score: 1

      Anonymity doesn't make or break arguments, nor does relative financial success.

    4. Re:Seriously, WTF is "Scrapy"?? by JustAnotherOldGuy · · Score: 1

      Anonymity doesn't make or break arguments, nor does relative financial success.

      Sometimes it does.

      If you say "You'll never make a dime selling widgets because no one wants widgets", and you subsequently make a million dollars selling widgets, then the financial success does seem to indicate that people do indeed want widgets, rendering the "no one wants widgets" argument wrong.

      And if you'll notice, AC made no argument, he simply made a bunch of ad hominem attacks and unfounded assertions. I didn't actually see an argument in his rant, perhaps you could point it out.

      --
      Just cruising through this digital world at 33 1/3 rpm...
  10. Additional information by stummjr · · Score: 5, Informative

    Sorry guys for the not so informative story up there. Well, Scrapy is an open source framework to build web crawlers in Python. The lack of Python 3 support was a huge blocker for people who wanted to use Scrapy, but didn't want to use Python 2.7 anymore. This is why this release is a milestone for Scrapy users. It's very popular among Python developers working with web crawling/web scraping. You can check the project page at GitHub here: https://github.com/scrapy/scra...
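
    For anyone still wondering what "web scraping" means in practice, here is a rough sketch of the core idea: take a page's HTML and pull structured data (here, the links) out of it. This is not Scrapy's API. It uses only the Python 3 standard library, and the `LinkExtractor` class, the `extract_links` helper, and the sample HTML are made up for illustration. A real crawler would also fetch the pages and follow the links it finds.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Hypothetical helper: collect absolute URLs from every <a href="...">."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags carry the links we care about.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute link targets found in `html`, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample page; the third anchor has no href and is skipped.
SAMPLE_HTML = """
<html><body>
  <a href="/scrapy/scrapy">project</a>
  <a href="https://example.com/docs">docs</a>
  <a name="anchor-without-href">skip me</a>
</body></html>
"""

if __name__ == "__main__":
    for link in extract_links(SAMPLE_HTML, "https://example.com/"):
        print(link)
```

    A framework like Scrapy wraps this kind of extraction together with request scheduling and link following, so you don't have to hand-roll those parts yourself.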

    1. Re:Additional information by h33t+l4x0r · · Score: 1

      It's very popular, but somehow took years to see Python 3 support? This is something that would take a single developer an afternoon to commit. I'm trying to decide if this speaks badly for the python community, or the scrapy community.

    2. Re:Additional information by h33t+l4x0r · · Score: 1

      Also, we all know that scrapers are a bunch of degenerate parasites (and yes I include myself in that group before the downvotes come in).

    3. Re:Additional information by BitZtream · · Score: 1

      And all 3 of you that use it or care already know about python3 support, and the entire rest of the Internet doesn't give a fuck.

      I know this because I have github projects that aren't working at all with more pull requests and forks than you have and I didn't do any spam advertising on slashdot for them.

      Slashdot is not an ad platform for your little pet project to reimplement (poorly) something that has been done properly at least 10 times already and done poorly, like yours, hundreds of times.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    4. Re:Additional information by Heart44 · · Score: 1

      This is a good book about web scraping and it includes material about Scrapy:

      http://www.amazon.com/Web-Scraping-Python-Collecting-Modern/dp/1491910291/ref=sr_1_1?s=books&ie=UTF8&qid=1454730640&sr=1-1&refinements=p_27%3ARyan+Mitchell

  11. Re:Thanks, Obama. by JustAnotherOldGuy · · Score: 1

    This is the most amazing thing I have ever heard! Scrapy is ... a thing, which ... does ... stuff,

    I know, isn't it incredible? It's the most amazing thing I've never heard of. I can't wait for another "article" about some "thing" that does "stuff".

    Thank you, Slashdot editors, for maintaining the high journalistic standards we've come to never, ever expect!

    --
    Just cruising through this digital world at 33 1/3 rpm...
  12. Cool! by Irate+Engineer · · Score: 1

    WTF is Scrapy?

    --

    Left MS Windows for Linux Mint and never looked back!

    Vote for Bernie in 2016!

  13. Re:Thanks, Obama. by stummjr · · Score: 1

    Hey, I wrote a couple of words about what Scrapy is in a comment: http://developers.slashdot.org...

  14. Thank God by Anonymous Coward · · Score: 2, Funny

    The only thing keeping me from using Scrapy were two things:

    1) Lack of Python 3 support
    2) Knowing what Scrapy is


    Now only 2) is the thing that is keeping me from using it!

  15. Re:Thanks, Obama. by arth1 · · Score: 1

    Who is this "yaelk" submitter anyhow? He or she makes an overworked timothy look like an outstanding editor in comparison, and that's quite an achievement.

  16. Re:Whoopie by taxman_10m · · Score: 1

    is there a direct equivalent to Scrapy?

  17. Re:Thanks, Obama. by JustAnotherOldGuy · · Score: 1

    He or she makes an overworked timothy look like an outstanding editor in comparison,

    A monkey with a severe head injury would look like an outstanding editor in comparison with timothy.

    --
    Just cruising through this digital world at 33 1/3 rpm...
  18. Re:Thanks, Obama. by plopez · · Score: 1

    That's worth at least $10 million in crowd funding!

    --
    putting the 'B' in LGBTQ+
  19. Re:Thanks, Obama. by plopez · · Score: 1

    Can you condense that for me? Perhaps submit it to /.?

    --
    putting the 'B' in LGBTQ+
  20. Re:UGH by plopez · · Score: 1

    all the cool kids use Ruby

    https://www.youtube.com/watch?...

    --
    putting the 'B' in LGBTQ+
  21. Dear New Slashdot Ownership: by Irate+Engineer · · Score: 1

    Something you may appreciate while reading this thread is that Slashdot users are very, very intolerant of advertising-motivated clickbait like this. We're here for, what was the phrase?..."News for Nerds"; thought-provoking science and technology discussions. We're not here to give our eyeballs all day to every website that paid you a few bucks to steer clicks to their new shiny product, though we realize bills need to be paid.

    If you're looking for ad revenue, you need to be smarter and less obnoxious about it with this crowd. We have ad-blockers and hosts files and we know how to use them, and we will use them until you figure out a respectful, non-intrusive means to deliver it while presenting content that draws in the nerds. You can work with the community here to develop a reasonable arrangement, or you can work against us and we'll just block everything, pack up, and leave.

    Dice went hard for clickbait at the expense of good content, and many left. Learn from their mistakes, please!

    --

    Left MS Windows for Linux Mint and never looked back!

    Vote for Bernie in 2016!

  22. Is it web scale? by goombah99 · · Score: 1

    Mongo DB is webscale.

    --
    Some drink at the fountain of knowledge. Others just gargle.
  23. We expect better under new management by Anonymous Coward · · Score: 1

    But this is Slashdot so if it's not written in C, C++ or Perl they will just chew it to pieces with hate and modding down.

    You make it sound like the criticism is not warranted. It *is* warranted, the blame being split squarely between:

    • The submitter for submitting a badly non-descriptive summary to the detriment of an excellent project.
    • The editors for not rejecting TFS (for being non-descriptive) or improving it by doing a few minutes of research and adding a few extra words. That's what editors are for, after all, and it's a pretty simple job.

    As TFS failed abysmally on both scores, the negative reception by readers was highly deserved.

    It was also predictable, given recent changes. We've had enough of this crap under DICE's mismanagement, and it's been turning Slashdot into a laughing stock. We don't want this incompetence to continue.

  24. Re:Whoopie by h33t+l4x0r · · Score: 1

    No, ruby mechanize is analogous to python mechanize or perl mechanize (but somewhat better). Scrapy is some crappy scraping framework that's analogous to any other crappy scraping framework. import.io comes to mind, as does kimono.