Mobile.de Car Data Scraper is a responsible and ethical data scraping project that retrieves car listing data from Mobile.de. This project enforces delays between requests to avoid overloading the website's servers.
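The delay between requests described above could be enforced with a small throttle like the one below. This is only a minimal sketch, not the project's actual implementation: the class name `RequestThrottle`, the method `acquire`, and the 500 ms delay are all assumptions chosen for illustration.

```java
import java.time.Duration;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not taken from the project): keeps consecutive requests at least minDelay apart. */
final class RequestThrottle {
    private final long minDelayNanos;
    private long lastRequestNanos;
    private boolean first = true;

    RequestThrottle(Duration minDelay) {
        this.minDelayNanos = minDelay.toNanos();
    }

    /** Blocks until minDelay has passed since the previous call and returns the time spent waiting, in nanoseconds. */
    synchronized long acquire() {
        long start = System.nanoTime();
        if (!first) {
            long remaining;
            // Loop in case the sleep returns slightly early.
            while ((remaining = minDelayNanos - (System.nanoTime() - lastRequestNanos)) > 0) {
                try {
                    TimeUnit.NANOSECONDS.sleep(remaining);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    throw new IllegalStateException("Interrupted while waiting between requests", e);
                }
            }
        }
        first = false;
        lastRequestNanos = System.nanoTime();
        return lastRequestNanos - start;
    }

    public static void main(String[] args) {
        // Assumed delay of 500 ms; the real value used by the project is not stated here.
        RequestThrottle throttle = new RequestThrottle(Duration.ofMillis(500));
        for (int i = 1; i <= 3; i++) {
            long waitedMs = throttle.acquire() / 1_000_000;
            // A real crawler would issue its HTTP request at this point.
            System.out.println("request " + i + " sent after waiting " + waitedMs + " ms");
        }
    }
}
```

Calling `acquire()` before every outgoing request keeps the pacing logic in one place, so the crawling code does not need to track timestamps itself.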
The project is written in Java 19 and makes use of the following technologies:
- Spring Boot
- Maven
- Log4j2
- JUnit 5
- Docker and Kubernetes
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
You will need to have the following software installed on your machine:
- Java 19
- Docker and Kubernetes (optional, only if you want to deploy the project in a containerized environment)
- Clone the repository
git clone https://github.com/robertciotoiu/mobile-de-scraper.git
- Set a valid localPath pointing to the location on your disk where MongoDB will persist its data
- Navigate to the project directory
cd mobile-de-car-data-collector/infrastructure
- Build and push the Docker images, then deploy all pods to a K8s namespace
./deploy.sh
The script builds the Docker images and pushes them to the local Docker image repository. It then creates a namespace named "rc" (the name can be changed) together with the K8s resources. Each pod automatically starts to crawl, parse, and save car listing data into a MongoDB instance that runs in its own pod but persists its data to the local disk.
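The crawl, parse, and save flow that each pod runs can be pictured roughly as follows. This is a hedged sketch under stated assumptions: `ListingPipeline`, `CarListing`, `ListingStore`, and `parseLines` are hypothetical names, the "id;title" line format is invented for the demo, and an in-memory list stands in for MongoDB.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of a crawl -> parse -> save flow; none of these names come from the project. */
final class ListingPipeline {
    /** Minimal stand-in for a parsed car listing. */
    record CarListing(String id, String title) {}

    /** Stand-in for the persistence layer (the real project saves to MongoDB). */
    interface ListingStore {
        void save(CarListing listing);
    }

    private final Function<String, String> crawler;           // URL -> raw page content
    private final Function<String, List<CarListing>> parser;  // raw content -> listings
    private final ListingStore store;

    ListingPipeline(Function<String, String> crawler,
                    Function<String, List<CarListing>> parser,
                    ListingStore store) {
        this.crawler = crawler;
        this.parser = parser;
        this.store = store;
    }

    /** Runs all three stages for every URL and returns the number of listings saved. */
    int run(List<String> urls) {
        int saved = 0;
        for (String url : urls) {
            String raw = crawler.apply(url);
            for (CarListing listing : parser.apply(raw)) {
                store.save(listing);
                saved++;
            }
        }
        return saved;
    }

    /** Toy parser for an assumed "id;title" line format; real pages would need an HTML parser. */
    static List<CarListing> parseLines(String raw) {
        List<CarListing> out = new ArrayList<>();
        for (String line : raw.split("\n")) {
            String[] parts = line.split(";", 2);
            if (parts.length == 2) {
                out.add(new CarListing(parts[0].trim(), parts[1].trim()));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<CarListing> inMemory = new ArrayList<>(); // replaces MongoDB in this demo
        ListingPipeline pipeline = new ListingPipeline(
                url -> "1;Example car A\n2;Example car B", // fake crawler returning canned content
                ListingPipeline::parseLines,
                inMemory::add);
        int saved = pipeline.run(List.of("https://example.invalid/page1"));
        System.out.println(saved + " listings saved: " + inMemory);
    }
}
```

Keeping the three stages behind small interfaces means each one can be swapped or tested in isolation, for example by replacing the store with an in-memory list as done in `main`.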
To run the JUnit tests, execute the following command in the project directory:
mvn test
If you want to deploy the project in a containerized environment, you can use Docker and Kubernetes.
Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.
- Robert Ciotoiu - robertciotoiu
See also the list of contributors who participated in this project.
This project is licensed under the MIT License - see the LICENSE.md file for details.