Interactive substring matching

The Machine Learning (ML) backend is designed to enhance the efficiency of auto-labeling in Named Entity Recognition (NER) tasks. It achieves this by selecting a keyword and automatically matching the same keyword in the provided text.

This ML backend works with the default NER template from Label Studio. You can find this by selecting Label Studio’s pre-built NER template when configuring the labeling interface. It is available under Natural Language Processing > Named Entity Recognition.

Here is an example of a labeling configuration that can be used with this ML backend:

<View>
  <Labels name="label" toName="text">
    <Label value="ORG" background="orange" />
    <Label value="PER" background="lightgreen" />
    <Label value="LOC" background="lightblue" />
    <Label value="MISC" background="lightgray" />
  </Labels>
  <Text name="text" value="$text" />
</View>
  1. Start the Machine Learning backend on http://localhost:9090 with prebuilt image:
docker-compose up
  1. Validate that the backend is running
$ curl http://localhost:9090/
{"status":"UP"}
  1. Create a project in Label Studio. Then from the Model page in the project settings, connect the model. The default URL is http://localhost:9090.

Building from source (advanced)

To build the ML backend from source, you have to clone the repository and build the Docker image:

docker-compose build

Running without Docker (advanced)

To run the ML backend without Docker, you have to clone the repository and install all dependencies using pip:

python -m venv ml-backend
source ml-backend/bin/activate
pip install -r requirements.txt

Then you can start the ML backend:

label-studio-ml start ./interactive_substring_matching

Configuration

Parameters can be set in docker-compose.yml before running the container.

The following common parameters are available:

  • BASIC_AUTH_USER - Specify the basic auth user for the model server
  • BASIC_AUTH_PASS - Specify the basic auth password for the model server
  • LOG_LEVEL - Set the log level for the model server
  • WORKERS - Specify the number of workers for the model server
  • THREADS - Specify the number of threads for the model server

Customization

The ML backend can be customized by adding your own models and logic inside the ./interactive_substring_matching directory.