Version 1.0 of the CrowdTruth Framework for crowdsourcing ground truth data, for training and evaluation of cognitive computing systems. Also check out version 2.0 at https://github.com/CrowdTruth/CrowdTruth-core. Data collected with the CrowdTruth methodology: http://data.crowdtruth.org/. Our papers: http://crowdtruth.org/papers/

CrowdTruth/CrowdTruth

Notice: This repository is no longer being maintained. Please see CrowdTruth-core for the latest release of the framework.

The CrowdTruth Framework implements an approach to machine-human computing for collecting annotation data on text, images, sounds and videos. The approach is focussed specifically on collecting gold standard data for training and evaluation of cognitive computing systems. The original framework was inspired by the IBM Watson project for providing improved (multi-perspective) gold standard (medical) text annotation data for the training and evaluation of various IBM Watson components, such as Medical Relation Extraction, Medical Factor Extraction and Question-Answer passage alignment.

The CrowdTruth framework supports the composition of CrowdTruth gathering workflows, where a sequence of micro-annotation tasks can be configured and sent out to a number of crowdsourcing platforms (e.g. CrowdFlower and Amazon Mechanical Turk) and applications (e.g. the expert annotation game Dr. Detective). The CrowdTruth framework has a special focus on micro-tasks for knowledge extraction in medical text (e.g. medical documents from various sources such as Wikipedia articles or patient case reports). The main steps involved in the CrowdTruth workflow are:

  1. Exploring and processing input data
  2. Collecting annotation data
  3. Applying disagreement analytics to the results

These steps are realised in an automatic end-to-end workflow that can support continuous collection of high-quality gold standard data, with a feedback loop to all steps of the process. Have a look at our presentations and papers for more details on the research.
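To give a feel for what the disagreement analytics step might involve, here is a minimal Python sketch. It is an illustration under stated assumptions, not the framework's actual implementation: the function names, the data layout, and the choice of cosine similarity between one worker's annotations and the aggregated annotations of all other workers on the same unit are assumptions made for this example. See the papers for the metrics the framework really uses.

```python
import math
from collections import Counter


def aggregate(annotations, exclude=None):
    """Sum the label counts of all workers on one unit, optionally leaving one worker out."""
    total = Counter()
    for worker, labels in annotations.items():
        if worker != exclude:
            total.update(labels)
    return total


def cosine(a, b):
    """Cosine similarity between two sparse label-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def worker_agreement(annotations):
    """Score each worker against the aggregate of every other worker on the same unit."""
    return {
        worker: cosine(Counter(labels), aggregate(annotations, exclude=worker))
        for worker, labels in annotations.items()
    }


# Hypothetical data: three workers annotate the relation expressed in one sentence.
annotations = {
    "w1": ["treats"],
    "w2": ["treats"],
    "w3": ["causes"],
}
scores = worker_agreement(annotations)
```

In this toy example `w3` shares no label with the other workers and scores lowest. Under the CrowdTruth view, a low score like this is a signal to inspect rather than an automatic reason to discard the worker, since disagreement can also point to an ambiguous input unit.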

Using CrowdTruth

Start using CrowdTruth right now, completely free, and explore all its possibilities. Follow the installation guide to get started, or check out our wiki for the full documentation of the platform. We have some crowdsourcing templates ready for you to start with.