The LibreCrawl Authentication API provides session-based authentication with a tier-based access control system. All API endpoints (except authentication endpoints) require a valid session.
Session-Based Authentication: LibreCrawl uses HTTP cookies for session management. After successful login, the server sets a session cookie that must be included in all subsequent requests.
Authentication Flow
- Register a new account with /api/register or use /api/guest-login for limited access
- Log in with /api/login to create a session (returns session cookie)
- Include the session cookie in all subsequent API requests
- Use /api/user/info to check current session status
- Call /api/logout to end the session
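The flow above can be sketched with the Python standard library. This is a minimal illustration rather than an official client: `BASE_URL` is taken from the curl examples in this document, and the helper names (`make_client`, `json_request`, `call`, `run_flow`) are hypothetical.

```python
import json
import urllib.request
from http.cookiejar import CookieJar

# Assumption: the local address used by the curl examples in this document.
BASE_URL = "http://localhost:5000"


def make_client():
    """Return an opener that stores and resends cookies (like curl -c / -b)."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def json_request(path, payload=None, method="POST"):
    """Build (but do not send) a request, JSON-encoding the payload if given."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def call(opener, req):
    """Send the request and decode the JSON body.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with opener.open(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_flow(username, password):
    """Log in, read the session status, then log out."""
    opener, _jar = make_client()
    call(opener, json_request("/api/login", {"username": username, "password": password}))
    info = call(opener, json_request("/api/user/info", method="GET"))
    call(opener, json_request("/api/logout"))
    return info
```

Because the cookie jar is attached to the opener, the session cookie set by /api/login is sent automatically on the later calls. `run_flow` needs a running server; the request builder can be inspected without one.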
Tier System
LibreCrawl uses a four-tier access control system that determines which API features and settings are available:
| Tier | Access Level | Limits |
|---|---|---|
| Guest | Limited crawling access | 3 crawls per 24 hours (IP-based), read-only, no settings access |
| User | Basic features | Unlimited crawls, crawler/export/issue settings |
| Extra | Advanced features | All User features + JavaScript rendering, filters, custom CSS |
| Admin | Full access | All features including concurrency, memory limits, proxy settings |
Local Mode: When running with the --local or -l flag, all users automatically receive the admin tier and no verification is required.
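One way to picture the tier table is as an ordered scale with a minimum tier per feature group. The sketch below is illustrative only: the `Tier` enum, the feature keys in `MIN_TIER`, and both helper functions are hypothetical names derived from the table, not LibreCrawl's actual code.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Tiers in ascending order of access, as listed in the table."""
    GUEST = 0
    USER = 1
    EXTRA = 2
    ADMIN = 3


# Minimum tier per feature group, read off the "Limits" column.
# The keys are hypothetical labels chosen for this example.
MIN_TIER = {
    "crawler_settings": Tier.USER,
    "export_settings": Tier.USER,
    "issue_settings": Tier.USER,
    "javascript_rendering": Tier.EXTRA,
    "filters": Tier.EXTRA,
    "custom_css": Tier.EXTRA,
    "concurrency": Tier.ADMIN,
    "memory_limits": Tier.ADMIN,
    "proxy_settings": Tier.ADMIN,
}


def effective_tier(tier, local_mode=False):
    """In local mode every user is treated as admin."""
    return Tier.ADMIN if local_mode else tier


def can_access(tier, feature, local_mode=False):
    """Return True if the (effective) tier meets the feature's minimum tier."""
    return effective_tier(tier, local_mode) >= MIN_TIER[feature]
```

For example, `can_access(Tier.USER, "proxy_settings")` is false, while the same check with `local_mode=True` passes.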
Endpoints
POST /api/register
Register a new user account. Creates an unverified user account with the "user" tier (requires admin verification in standard mode).
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| username | string | Yes | Username (must be unique) |
| email | string | Yes | Email address (must be unique) |
| password | string | Yes | Password (minimum 8 characters) |
Example Request
curl -X POST http://localhost:5000/api/register \
-H "Content-Type: application/json" \
-d '{
"username": "john_doe",
"email": "[email protected]",
"password": "SecurePass123"
}'
Success Response (200 OK)
{
"success": true,
"message": "Registration successful"
}
Error Response (400 Bad Request)
{
"success": false,
"error": "Username already exists"
}
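A client can catch the most common registration mistakes before sending the request. The sketch below mirrors the documented rules (three required fields, passwords of at least 8 characters) and reuses the error strings from the error table. The function name and the order of the checks are assumptions; the server remains the authority, for instance on uniqueness.

```python
REQUIRED_FIELDS = ("username", "email", "password")
MIN_PASSWORD_LENGTH = 8


def validate_registration(payload):
    """Return an error message for an invalid payload, or None if it looks valid.

    Only the checks that can be done client-side are performed here.
    """
    if any(not payload.get(field) for field in REQUIRED_FIELDS):
        return "Missing required fields"
    if len(payload["password"]) < MIN_PASSWORD_LENGTH:
        return "Password must be at least 8 characters"
    return None
```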
POST /api/login
Authenticate a user and create a session. Returns a session cookie that must be included in all subsequent requests.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| username | string | Yes | Username or email |
| password | string | Yes | User password |
Example Request
curl -X POST http://localhost:5000/api/login \
-H "Content-Type: application/json" \
-c cookies.txt \
-d '{
"username": "john_doe",
"password": "SecurePass123"
}'
Success Response (200 OK)
{
"success": true,
"message": "Login successful"
}
# Response includes Set-Cookie header with session ID
Set-Cookie: session=eyJfcGVybWFuZW50Ijp0cn...; HttpOnly; Path=/
Error Responses
# Invalid credentials (401 Unauthorized)
{
"success": false,
"error": "Invalid username or password"
}
# Account not verified (403 Forbidden)
{
"success": false,
"error": "Account not verified. Please wait for admin approval."
}
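To inspect the Set-Cookie header shown in the success response, the standard library's `http.cookies` module is enough. This is a small sketch; `parse_session_cookie` is a hypothetical helper, and the cookie name `session` is taken from the example header above.

```python
from http.cookies import SimpleCookie


def parse_session_cookie(set_cookie_header, name="session"):
    """Extract the named cookie and its attributes from a Set-Cookie value.

    Returns None when the header does not contain the cookie.
    """
    cookie = SimpleCookie()
    cookie.load(set_cookie_header)
    morsel = cookie.get(name)
    if morsel is None:
        return None
    return {
        "value": morsel.value,
        "http_only": bool(morsel["httponly"]),  # flag attributes parse to True
        "secure": bool(morsel["secure"]),       # empty string when absent
        "path": morsel["path"],
    }
```

In practice a cookie-aware HTTP client handles this automatically; parsing by hand is mainly useful for debugging, for example to confirm that the HttpOnly flag is set.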
POST /api/guest-login
Create a guest session with limited access. Guest users can perform up to 3 crawls per 24-hour period (tracked by IP address).
Request Body
No request body required.
Example Request
curl -X POST http://localhost:5000/api/guest-login \
-c cookies.txt
Success Response (200 OK)
{
"success": true,
"message": "Guest login successful"
}
Error Response (429 Too Many Requests)
{
"success": false,
"error": "Guest crawl limit reached (3 per 24 hours)"
}
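The guest limit can be pictured as a per-IP counter over a 24-hour window. The sketch below is an assumption-laden illustration, not LibreCrawl's implementation: it keeps timestamps in memory and uses a rolling window, whereas the source only states "3 crawls per 24 hours". The class and method names are hypothetical.

```python
import time
from collections import defaultdict, deque

GUEST_LIMIT = 3                  # crawls allowed per window (from the docs)
WINDOW_SECONDS = 24 * 60 * 60    # 24 hours


class GuestCrawlLimiter:
    """In-memory rolling-window limiter keyed by client IP."""

    def __init__(self, limit=GUEST_LIMIT, window=WINDOW_SECONDS):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # ip -> timestamps of recent crawls

    def remaining(self, ip, now=None):
        """Drop timestamps older than the window and return the crawls left."""
        now = time.time() if now is None else now
        hits = self._hits[ip]
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        return self.limit - len(hits)

    def try_crawl(self, ip, now=None):
        """Record a crawl if allowed; return False when the limit is reached."""
        now = time.time() if now is None else now
        if self.remaining(ip, now) <= 0:
            return False
        self._hits[ip].append(now)
        return True
```

A server built this way would answer with 429 whenever `try_crawl` returns False. The `now` parameter exists so the behavior can be checked with fixed timestamps.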
POST /api/logout
End the current user session and invalidate the session cookie.
Request Body
No request body required.
Headers
Must include valid session cookie.
Example Request
curl -X POST http://localhost:5000/api/logout \
-b cookies.txt
Success Response (200 OK)
{
"success": true,
"message": "Logged out successfully"
}
GET /api/user/info
Get information about the currently authenticated user, including tier level and crawl limits.
Headers
Must include valid session cookie.
Example Request
curl http://localhost:5000/api/user/info \
-b cookies.txt
Success Response (200 OK)
{
"success": true,
"user": {
"id": 1,
"username": "john_doe",
"tier": "user",
"crawls_today": 5,
"crawls_remaining": null // null = unlimited
}
}
# For guest users:
{
"success": true,
"user": {
"id": null,
"username": "guest",
"tier": "guest",
"crawls_today": 1,
"crawls_remaining": 2
}
}
Error Response (401 Unauthorized)
{
"success": false,
"error": "Authentication required"
}
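When consuming this response, note that `crawls_remaining` is null (Python `None`) for unlimited tiers and a number for guests. A minimal sketch of a client-side check, with `can_start_crawl` as a hypothetical helper name:

```python
def can_start_crawl(info):
    """Decide from a parsed /api/user/info body whether another crawl is allowed.

    crawls_remaining is None for unlimited tiers and an integer for guests.
    """
    if not info.get("success"):
        return False
    remaining = info["user"]["crawls_remaining"]
    return remaining is None or remaining > 0
```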
Password Security
LibreCrawl implements secure password handling:
- Hashing: Passwords are hashed using bcrypt with automatic salt generation
- Minimum Length: Passwords must be at least 8 characters
- Storage: Only hashed passwords are stored in the database
- Verification: Password verification uses constant-time comparison to prevent timing attacks
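The salt-then-hash and constant-time comparison ideas above can be illustrated with the standard library. Note the substitution: LibreCrawl uses bcrypt, which is a third-party package, so this sketch uses PBKDF2 from `hashlib` as a stand-in to stay self-contained. The iteration count and function names are illustrative assumptions.

```python
import hashlib
import hmac
import os

# Illustrative work factor for this stand-in; not a LibreCrawl setting.
ITERATIONS = 200_000


def hash_password(password, salt=None):
    """Derive a digest from the password with a random per-password salt."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-derive the digest and compare it in constant time."""
    _salt, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

Only the salt and digest would be stored. `hmac.compare_digest` avoids leaking, through response timing, how many leading bytes of the digest matched.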
Session Management
Sessions are managed using Flask's session system:
- Cookie-Based: Session data is stored in signed, HTTP-only cookies (Flask's default session cookies are signed against tampering, not encrypted)
- Automatic Expiration: Crawler instances expire after 1 hour of inactivity
- Per-User Isolation: Each user gets their own crawler instance and settings
- Secure Cookies: Set SESSION_COOKIE_SECURE=True in production for HTTPS-only cookies
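A production configuration along these lines could be expressed as a Flask config object. `SESSION_COOKIE_SECURE`, `SESSION_COOKIE_HTTPONLY`, and `SESSION_COOKIE_SAMESITE` are standard Flask configuration keys; the class name is hypothetical, the SameSite value is a suggestion that does not come from this document, and how LibreCrawl itself loads its settings is not stated here.

```python
class ProductionSessionConfig:
    """Cookie-related Flask settings for an HTTPS deployment (illustrative)."""

    SESSION_COOKIE_SECURE = True      # send the cookie over HTTPS only
    SESSION_COOKIE_HTTPONLY = True    # hide the cookie from JavaScript
    SESSION_COOKIE_SAMESITE = "Lax"   # assumption: not specified by the docs
```

In a generic Flask application, such an object is applied with `app.config.from_object(ProductionSessionConfig)`.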
IP-Based Rate Limiting
Guest users are rate-limited by IP address. To determine the client's IP, the system checks the following sources in this order:
- CF-Connecting-IP - Cloudflare original client IP
- X-Forwarded-For - Proxy/load balancer forwarded IP
- X-Real-IP - Nginx real IP
- REMOTE_ADDR - Direct connection IP
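The lookup order above can be sketched as a small function. This is an illustration under assumptions: the function name is hypothetical, and taking the first entry of a comma-separated X-Forwarded-For value is a common convention that this document does not confirm for LibreCrawl.

```python
# Headers checked in priority order; REMOTE_ADDR is the fallback.
HEADER_ORDER = ("CF-Connecting-IP", "X-Forwarded-For", "X-Real-IP")


def client_ip(headers, remote_addr):
    """Return the first non-empty IP source, following the documented order.

    `headers` is any mapping of header names to values; lookups are
    case-insensitive. `remote_addr` is the direct connection address.
    """
    lowered = {key.lower(): value for key, value in headers.items()}
    for name in HEADER_ORDER:
        value = lowered.get(name.lower(), "").strip()
        if value:
            # Assumption: X-Forwarded-For may hold "client, proxy1, proxy2".
            return value.split(",")[0].strip()
    return remote_addr
```

These headers are only trustworthy when a proxy in front of the application sets or overwrites them; a directly exposed server would otherwise accept whatever the client sends.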
Development Tip: Run LibreCrawl with the --local flag during development to bypass authentication and automatically get admin tier access.
Error Handling
The Authentication API returns specific error messages for different failure scenarios:
| Status Code | Scenario | Error Message |
|---|---|---|
| 400 | Missing fields | "Missing required fields" |
| 400 | Password too short | "Password must be at least 8 characters" |
| 400 | Username taken | "Username already exists" |
| 400 | Email taken | "Email already exists" |
| 401 | Invalid credentials | "Invalid username or password" |
| 401 | No session | "Authentication required" |
| 403 | Unverified account | "Account not verified. Please wait for admin approval." |
| 429 | Guest limit exceeded | "Guest crawl limit reached (3 per 24 hours)" |
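On the client side, the table above maps naturally onto a small exception hierarchy keyed by status code. The class names and `check_response` are hypothetical; only the status codes and the `success`/`error` fields come from this document.

```python
class ApiError(Exception):
    """Base class for failed API calls; keeps the status and server message."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


class ValidationError(ApiError):
    """400: missing fields, short password, or taken username/email."""


class AuthenticationError(ApiError):
    """401: invalid credentials or no session."""


class NotVerifiedError(ApiError):
    """403: account awaiting admin approval."""


class RateLimitedError(ApiError):
    """429: guest crawl limit reached."""


ERROR_TYPES = {
    400: ValidationError,
    401: AuthenticationError,
    403: NotVerifiedError,
    429: RateLimitedError,
}


def check_response(status, body):
    """Return the body on success, otherwise raise the matching ApiError."""
    if body.get("success"):
        return body
    error_cls = ERROR_TYPES.get(status, ApiError)
    raise error_cls(status, body.get("error", "Unknown error"))
```

Callers can then catch `RateLimitedError` to back off, or `AuthenticationError` to prompt for a new login, without comparing error strings.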
Next Steps
- Crawl Control API - Start and manage crawls
- Status API - Get real-time crawl data
- Getting Started Guide - Build your first integration