Troubleshoot machine learning
After you set up machine learning with Label Studio or create your own machine learning backend to use with Label Studio, you can troubleshoot any issues you encounter by reviewing the possible causes on this page.
You can investigate most problems using the server console log. The machine learning backend runs as a separate server from Label Studio, so make sure you check the correct server console logs while troubleshooting. To see more detailed logs, start the ML backend server with the
If you’re running an ML backend:
- Production training logs are located in
- Production runtime logs are located in
In development mode, training logs appear in the web browser console.
If you’re running an ML backend using Docker Compose:
- Training logs are located in
- Main process and inference logs are located in
Label Studio applies a default timeout to every type of request it sends to the ML server.
Label Studio sends several different requests to the ML server:
- Health - request to check the ML backend health status when adding a new ML backend (env variable ML_TIMEOUT_HEALTH)
- Setup - request to set up the ML backend and initialize the ML model (env variable ML_TIMEOUT_SETUP)
- Predict - prediction request when Label Studio gets predictions from the ML backend (env variable ML_TIMEOUT_PREDICT)
- Train - request to train the ML backend (env variable ML_TIMEOUT_TRAIN)
- Duplicate model - request to the ML backend to duplicate a model (env variable ML_TIMEOUT_DUPLICATE_MODEL)
- Delete - delete request sent to the ML backend (env variable ML_TIMEOUT_DELETE)
- Train job status - request for the training job status from the ML backend (env variable ML_TIMEOUT_TRAIN_JOB_STATUS)
You can adjust each timeout by setting the corresponding environment variable, or by modifying the variables in Label Studio. These are the variables as defined in Label Studio (values in seconds):
CONNECTION_TIMEOUT = float(get_env('ML_CONNECTION_TIMEOUT', 1))
TIMEOUT_DEFAULT = float(get_env('ML_TIMEOUT_DEFAULT', 100))
TIMEOUT_TRAIN = float(get_env('ML_TIMEOUT_TRAIN', 30))
TIMEOUT_PREDICT = float(get_env('ML_TIMEOUT_PREDICT', 100))
TIMEOUT_HEALTH = float(get_env('ML_TIMEOUT_HEALTH', 1))
TIMEOUT_SETUP = float(get_env('ML_TIMEOUT_SETUP', 3))
TIMEOUT_DUPLICATE_MODEL = float(get_env('ML_TIMEOUT_DUPLICATE_MODEL', 1))
TIMEOUT_DELETE = float(get_env('ML_TIMEOUT_DELETE', 1))
TIMEOUT_TRAIN_JOB_STATUS = float(get_env('ML_TIMEOUT_TRAIN_JOB_STATUS', 1))
You can modify them in ml/api_connector.py.
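To see which timeout values a given environment would produce, the defaults listed above can be mirrored in a small standalone script. This is only a sketch: it uses `os.environ` in place of Label Studio's own `get_env` helper, and the `effective_timeouts` function name is hypothetical, so treat the output as an approximation of what Label Studio resolves internally.

```python
import os

# Defaults copied from the variable list above (values in seconds).
DEFAULTS = {
    "ML_CONNECTION_TIMEOUT": 1,
    "ML_TIMEOUT_DEFAULT": 100,
    "ML_TIMEOUT_TRAIN": 30,
    "ML_TIMEOUT_PREDICT": 100,
    "ML_TIMEOUT_HEALTH": 1,
    "ML_TIMEOUT_SETUP": 3,
    "ML_TIMEOUT_DUPLICATE_MODEL": 1,
    "ML_TIMEOUT_DELETE": 1,
    "ML_TIMEOUT_TRAIN_JOB_STATUS": 1,
}


def effective_timeouts(environ=None):
    """Return the value each variable would resolve to, in seconds.

    Hypothetical helper: reads plain environment variables and falls
    back to the defaults above when a variable is not set.
    """
    environ = os.environ if environ is None else environ
    return {
        name: float(environ.get(name, default))
        for name, default in DEFAULTS.items()
    }


if __name__ == "__main__":
    for name, seconds in effective_timeouts().items():
        print(f"{name}={seconds}")
```

Run the script in the same shell or container that starts Label Studio, so that it sees the same environment variables.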
Your ML backend server might not have started properly.
- Check whether the ML backend server is running. Run the following health check:
curl -X GET http://localhost:9090/health
- If the health check doesn’t respond, or you see errors, check the server logs.
- If you used Docker Compose to start the ML backend, check for requirements missing from the requirements.txt file used to set up the environment inside Docker.
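The health check above can also be scripted, for example to run it from the machine or container that hosts Label Studio. The following is a minimal sketch using only the Python standard library. The `backend_is_healthy` name and the 5-second default timeout are assumptions for illustration, and it treats any HTTP 200 response from `/health` as healthy without inspecting the response body.

```python
import urllib.error
import urllib.request


def backend_is_healthy(base_url="http://localhost:9090", timeout=5.0):
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError) as exc:
        # Covers refused connections, DNS failures, timeouts,
        # and HTTP error statuses raised by urlopen.
        print(f"Health check failed for {url}: {exc}")
        return False


if __name__ == "__main__":
    print(backend_is_healthy())
```

If this returns False while the same check succeeds on the ML backend host itself, the backend is probably running but not reachable from where you ran the script.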
The ML backend seems to be connected, but after I click “Start Training”, I see an “Error. Click here for details.” message
Click the error message to review the traceback. Common errors that you might see include:
- Insufficient number of annotations completed for training to begin.
- Memory issues on the server.
If you can’t resolve the traceback issues by yourself, contact us on Slack.
Your ML backend might be producing predictions in the wrong format.
- Check to see whether the ML backend predictions format follows the same structure as predictions in imported pre-annotations.
- Confirm that your project’s label configuration matches the output produced by your ML backend. For example, use the Choices tag to create a class of predictions for text. See more Label Studio tags.
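A quick structural check on one prediction can catch the most common format mistakes before you compare it against the labeling configuration by hand. The sketch below assumes the pre-annotation layout in which each prediction has a `result` list whose items carry `from_name`, `to_name`, `type`, and `value`. These key names are not spelled out on this page, so verify them against the predictions documentation. The `find_prediction_problems` helper and the example values are hypothetical.

```python
# Keys assumed to be required on every item in "result".
REQUIRED_RESULT_KEYS = ("from_name", "to_name", "type", "value")


def find_prediction_problems(prediction):
    """Return a list of readable problems; an empty list means none found."""
    problems = []
    results = prediction.get("result")
    if not isinstance(results, list):
        return ["'result' must be a list of result items"]
    for index, item in enumerate(results):
        if not isinstance(item, dict):
            problems.append(f"result[{index}] is not a dict")
            continue
        for key in REQUIRED_RESULT_KEYS:
            if key not in item:
                problems.append(f"result[{index}] is missing '{key}'")
    return problems


# Hypothetical prediction for a config with
# <Choices name="sentiment" toName="text">.
example = {
    "result": [
        {
            "from_name": "sentiment",
            "to_name": "text",
            "type": "choices",
            "value": {"choices": ["Positive"]},
        }
    ],
    "score": 0.92,
}

if __name__ == "__main__":
    print(find_prediction_problems(example))
```

This only checks that the expected keys are present. It does not confirm that `from_name` and `to_name` match the tag names in your labeling configuration, which you still need to compare yourself.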
If you see errors about missing packages in the terminal after starting your ML backend server, or in the logs, you might need to specify additional packages in the
requirements.txt file for your ML backend.
Because the ML backend and Label Studio are different services, the assets (images, audio, etc.) that you label must be hosted at URLs that the machine learning backend can access. Otherwise, the backend might fail to create predictions.
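One way to spot tasks that are likely to cause this problem is to flag asset fields whose values are not absolute http(s) URLs. The sketch below is illustrative only: the `non_url_assets` helper, the field names, and the example paths are hypothetical, and you need to tell it which task data keys hold assets, because plain-text fields are legitimately not URLs.

```python
from urllib.parse import urlparse


def non_url_assets(task_data, asset_keys):
    """Return the asset keys whose values are not absolute http(s) URLs."""
    flagged = []
    for key in asset_keys:
        value = task_data.get(key)
        parsed = urlparse(value) if isinstance(value, str) else None
        if parsed is None or parsed.scheme not in ("http", "https") or not parsed.netloc:
            flagged.append(key)
    return flagged


# Hypothetical task data: one local path and one hosted file.
task = {
    "image": "/data/upload/1/cat.jpg",
    "audio": "https://example.com/clip.wav",
}

if __name__ == "__main__":
    print(non_url_assets(task, ["image", "audio"]))
```

A value that passes this check is not guaranteed to be reachable. The URL must still resolve and be readable from the machine or container where the ML backend runs.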
If you get a validation error when adding the ML backend URL to your Label Studio project, check the following:
- Is the labeling interface set up with a valid configuration?
- Is the machine learning backend running? Run the following health check:
curl -X GET http://localhost:9090/health
- Is your machine learning backend available from your Label Studio instance? It must be available to the instance running Label Studio.
If you’re running Label Studio in Docker, you must run the machine learning backend inside the same Docker container, or otherwise make it available to the Docker container running Label Studio. You can use the
docker exec command to run commands inside the Docker container, or use
docker exec -it <container_id> /bin/sh to start a shell in the context of the container. See the docker exec documentation.
Default timeouts for all types of requests to the ML server in SaaS Cloud (in seconds):
TIMEOUT_DEFAULT = 100
TIMEOUT_TRAIN = 30
TIMEOUT_PREDICT = 100
TIMEOUT_HEALTH = 1
TIMEOUT_SETUP = 3
TIMEOUT_DUPLICATE_MODEL = 1
TIMEOUT_DELETE = 1
TIMEOUT_TRAIN_JOB_STATUS = 1