
Develop a Python script to fetch and process Quote of the Day
Open, Needs Triage, Public

Description

Wikiquote features a Quote of the Day, showcasing some of the best quotes from the project. However, there is no structured way to get this information directly from an API.

The idea is to create a script, preferably in Python, that fetches https://en.wikiquote.org/wiki/Wikiquote:Quote_of_the_day, which has a regular structure, and parses it to extract the following for each quote:

  • Quote
  • Quoted by (author)
  • Feature date
  • Unique index for each quote and author combination
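As a starting point for discussion, the four fields above could be held in a small record type. This is only a sketch: the names `QuoteRecord`, `make_unique_id` and `build_record` are hypothetical, and deriving the unique index from a SHA-256 digest of the normalized quote and author is an assumption, not a decided design.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


def make_unique_id(quote: str, author: str) -> str:
    """Return a stable ID for a quote/author pair (hypothetical scheme)."""
    # Collapse whitespace and lowercase both parts so that trivial
    # formatting differences do not produce a different ID.
    norm_quote = " ".join(quote.split()).lower()
    norm_author = " ".join(author.split()).lower()
    # A separator keeps ("ab", "c") distinct from ("a", "bc").
    payload = f"{norm_quote}\x1f{norm_author}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:16]


@dataclass(frozen=True)
class QuoteRecord:
    quote: str
    author: str
    feature_date: date
    unique_id: str


def build_record(quote: str, author: str, feature_date: date) -> QuoteRecord:
    """Bundle the parsed fields together with their derived unique ID."""
    return QuoteRecord(quote, author, feature_date, make_unique_id(quote, author))
```

A digest is used here rather than Python's built-in `hash()`, because string hashes are salted per process and would change from one daily run to the next.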

The final output will be a Python script that runs daily and updates a database with the latest information. Additionally, there will be an API that reusers can query for the required information.

This task has an immediate partnership need: an external partner would like to feature Wikiquote quotes in their products. cc @PDas
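Since the script is meant to run daily, it helps if re-running it on the same day does not create duplicate rows. A minimal sketch of that idea with SQLite follows. The table layout, the function names `init_db` and `store_quote`, and the choice to make the pair (unique_id, feature_date) unique are all assumptions for illustration.

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create the quotes table if it does not exist yet (assumed schema)."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS quotes (
               id INTEGER PRIMARY KEY,
               quote TEXT NOT NULL,
               author TEXT NOT NULL,
               feature_date TEXT NOT NULL,
               unique_id TEXT NOT NULL,
               UNIQUE (unique_id, feature_date)
           )"""
    )


def store_quote(conn: sqlite3.Connection, quote: str, author: str,
                feature_date: str, unique_id: str) -> bool:
    """Insert one quote; return True if a row was added, False if it existed."""
    # INSERT OR IGNORE skips the row when the UNIQUE constraint would be
    # violated, so a repeated daily run is a no-op.
    cur = conn.execute(
        "INSERT OR IGNORE INTO quotes (quote, author, feature_date, unique_id) "
        "VALUES (?, ?, ?, ?)",
        (quote, author, feature_date, unique_id),
    )
    conn.commit()
    return cur.rowcount == 1
```

With this shape, a scheduler such as cron can simply call the script once or several times a day without any extra bookkeeping.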

Note: this is a proposed task for WTS 2024 mini-hackathon, we'll see what is achievable during the event.

Event Timeline

Thank you for tagging this task with good first task for Wikimedia newcomers!

Newcomers often may not be aware of things that may seem obvious to seasoned contributors, so please take a moment to reflect on how this task might look to somebody who has never contributed to Wikimedia projects.

A good first task is a self-contained, non-controversial task with a clear approach. It should be well-described with pointers to help a completely new contributor, for example it should clearly point to the codebase URL and provide clear steps to help a contributor get set up for success. We've included some guidelines at https://phabricator.wikimedia.org/tag/good_first_task/ !

Thank you for helping us drive new contributions to our projects <3

I would like to work on it.
Need repo!

I am interested in working on this and similar projects. I am an Information Science professional with 17 years of experience and extensive knowledge of unique information resources and their sources.

import hashlib
import sqlite3
from datetime import datetime

import requests
from bs4 import BeautifulSoup

# Set up SQLite database
conn = sqlite3.connect('wikiquote.db')
c = conn.cursor()

# Create table if not exists
c.execute('''CREATE TABLE IF NOT EXISTS quotes
             (id INTEGER PRIMARY KEY, quote TEXT, author TEXT, date TEXT, unique_id TEXT)''')


# Fetch and parse Wikiquote's Quote of the Day
def fetch_quote_of_the_day():
    url = "https://en.wikiquote.org/wiki/Wikiquote:Quote_of_the_day"
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'html.parser')

    # Find the latest quote and author. The 'mp-qotd', 'qotd-text' and
    # 'qotd-author' selectors assume a consistent page structure and have
    # not been verified against the live page.
    quote_section = soup.find('div', {'id': 'mp-qotd'})
    if quote_section is None:
        raise ValueError("Quote of the day section not found on the page")
    quote = quote_section.find('div', {'class': 'qotd-text'}).text.strip()
    author = quote_section.find('div', {'class': 'qotd-author'}).text.strip()

    # Today's date for the feature date
    feature_date = datetime.today().strftime('%Y-%m-%d')

    # Create a unique ID based on quote and author. The built-in hash() is
    # salted per process for strings, so a stable digest is used instead.
    unique_id = hashlib.sha256(f"{quote}|{author}".encode('utf-8')).hexdigest()

    return {'quote': quote, 'author': author, 'date': feature_date, 'unique_id': unique_id}


# Insert quote into the database
def store_quote(quote_data):
    c.execute("INSERT INTO quotes (quote, author, date, unique_id) VALUES (?, ?, ?, ?)",
              (quote_data['quote'], quote_data['author'], quote_data['date'], quote_data['unique_id']))
    conn.commit()


# Run the function
quote_data = fetch_quote_of_the_day()
store_quote(quote_data)
conn.close()

from flask import Flask, jsonify
import sqlite3

app = Flask(__name__)


# Fetch quote from the database
def get_latest_quote():
    conn = sqlite3.connect('wikiquote.db')
    c = conn.cursor()
    c.execute("SELECT * FROM quotes ORDER BY date DESC LIMIT 1")
    result = c.fetchone()
    conn.close()
    # The table may still be empty if the daily script has not run yet
    if result is None:
        return None
    return {'quote': result[1], 'author': result[2], 'date': result[3], 'unique_id': result[4]}


# API route
@app.route('/api/quote', methods=['GET'])
def quote_api():
    latest = get_latest_quote()
    if latest is None:
        return jsonify({'error': 'no quotes stored yet'}), 404
    return jsonify(latest)


if __name__ == '__main__':
    app.run(debug=True)

I want to work on this issue; please assign it to me.