✨All the goodies you'll ever need to scrape the web (NodeJs / Browser)

JimmyLaurent/lycos

🐾 lycos.js

All the goodies you'll ever need to scrape the web

Documentation

In-browser Playground

You can try the library on CodeSandbox. The playground uses a CORS proxy fetcher, so you can fetch content from any website from inside your browser.

Installation

yarn add lycos
# or
npm i lycos

Features

  • ⚡️️ All in one package to fetch and scrape data from the web
  • ⭐ Node & Browser Support
  • 💡 Powerful declarative API
  • 🚀 Blazingly fast (supports concurrency)
  • 🔧 Extensible
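
The concurrency point can be illustrated with plain JavaScript. The sketch below is not lycos code and makes no claim about how lycos schedules requests internally: `mapWithConcurrency` and `fakeFetchPage` are hypothetical names introduced here, and the fetch step is stubbed so the example runs offline.

```javascript
// Generic helper (hypothetical, not part of lycos): runs `task` over `items`
// with at most `limit` tasks in flight, preserving input order in the result.
async function mapWithConcurrency(items, limit, task) {
  const results = new Array(items.length);
  let next = 0;

  async function worker() {
    while (next < items.length) {
      // Claiming an index is synchronous, so two workers never take the same one.
      const index = next++;
      results[index] = await task(items[index], index);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Stand-in for a real fetch-and-scrape call (assumption: any async function works).
const fakeFetchPage = async url => ({ url, title: `Title of ${url}` });

const urls = [
  'http://example.com/1',
  'http://example.com/2',
  'http://example.com/3'
];

// At most two pages are "fetched" at the same time.
const done = mapWithConcurrency(urls, 2, fakeFetchPage);
done.then(pages => console.log(pages.map(p => p.title)));
```

In a real script, `task` could be a function such as `url => lycos.get(url)`, the call shown in the Quick Example; whether the library offers its own concurrency controls is not covered here.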

Quick Example

const lycos = require('lycos');

(async () => {
  // Fetch the given URL and return a page scraper
  const page = await lycos.get('http://quotes.toscrape.com');

  // Scrape all the quote elements
  const quoteElements = page.scrapeAll('.quote');

  // For each quote element, scrape the text and the author
  const quotes = quoteElements.map(element => ({
    text: element.scrape('.text').text(),
    author: element.scrape('.author').text()
  }));

  // Shortcut to scrape the collection of quotes
  const quotesFromSchema = page.scrapeAll('.quote', {
    author: '.author@text',
    text: '.text@text'
  });

  // Shortcut to fetch and scrape
  const quotesInOneCall = await lycos
    .get('http://quotes.toscrape.com')
    .scrapeAll('.quote', {
      author: '.author@text',
      text: '.text@text'
    });
})();
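
The shortcut forms above use strings such as `.author@text`. Judging from the longhand version, the part before `@` appears to be the CSS selector and the part after it names what to extract (here, the element's text). The sketch below only illustrates that reading of the notation: `parseSelectorSpec` and `describeSchema` are hypothetical helpers written for this example, not functions exported by lycos, and they do not reflect how the library parses these strings internally.

```javascript
// Hypothetical helper, NOT part of lycos: splits a 'selector@property'
// string into its selector and the property to extract.
function parseSelectorSpec(spec) {
  const at = spec.lastIndexOf('@');
  if (at === -1) {
    // No '@' present: treat the whole string as a selector.
    return { selector: spec.trim(), property: null };
  }
  return {
    selector: spec.slice(0, at).trim(),
    property: spec.slice(at + 1).trim() || null
  };
}

// Hypothetical helper: turns a schema object like the one in the
// Quick Example into a flat list of { key, selector, property } entries.
function describeSchema(schema) {
  return Object.entries(schema).map(([key, spec]) => ({
    key,
    ...parseSelectorSpec(spec)
  }));
}

const plan = describeSchema({
  author: '.author@text',
  text: '.text@text'
});

console.log(plan);
```

Read this way, each schema key becomes a field on the resulting object, which matches the shape built by hand with `element.scrape('.author').text()` in the longhand version.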

Credits

• FB55: his work represents the core of this library.

• Matt Mueller and cheerio contributors: a good portion of the code and concepts is copied or derived from the cheerio and x-ray libraries.

License

MIT © 2019 Jimmy Laurent
