The LibreCrawl Authentication API provides session-based authentication with a tier-based access control system. All API endpoints except /api/register, /api/login, and /api/guest-login require a valid session.

Session-Based Authentication: LibreCrawl uses HTTP cookies for session management. After successful login, the server sets a session cookie that must be included in all subsequent requests.

Authentication Flow

  1. Register a new account with /api/register or use /api/guest-login for limited access
  2. Log in with /api/login to create a session (the response sets the session cookie)
  3. Include the session cookie in all subsequent API requests
  4. Use /api/user/info to check current session status
  5. Call /api/logout to end the session
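The flow above can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the base URL is assumed from the curl examples on this page, and the helper names (make_session, json_request, run_flow) are hypothetical.

```python
import json
import urllib.request
from http.cookiejar import CookieJar

BASE_URL = "http://localhost:5000"  # assumed from the curl examples


def make_session():
    """Return an opener that stores and resends cookies, like curl -c/-b."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def json_request(path, payload=None, method="POST"):
    """Build a request for an API path, JSON-encoding the payload if given."""
    data = None
    headers = {}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def run_flow(username, password):
    """Steps 2-5: log in, check the session, then log out.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    opener, _jar = make_session()
    credentials = {"username": username, "password": password}
    with opener.open(json_request("/api/login", credentials)) as resp:
        print(json.load(resp))
    with opener.open(json_request("/api/user/info", method="GET")) as resp:
        print(json.load(resp))
    with opener.open(json_request("/api/logout")) as resp:
        print(json.load(resp))
```

Calling run_flow("john_doe", "SecurePass123") against a running server would exercise the whole sequence; the cookie jar plays the role of cookies.txt in the curl examples.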

Tier System

LibreCrawl uses a four-tier access control system that determines which API features and settings are available:

Tier  | Access Level            | Limits
Guest | Limited crawling access | 3 crawls per 24 hours (IP-based), read-only, no settings access
User  | Basic features          | Unlimited crawls, crawler/export/issue settings
Extra | Advanced features       | All User features + JavaScript rendering, filters, custom CSS
Admin | Full access             | All features including concurrency, memory limits, proxy settings

Local Mode: When running with --local or -l flag, all users automatically receive admin tier and no verification is required.
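One way to picture the tier table and the local-mode override is as an ordered list with a minimum tier per feature. The sketch below is illustrative only: the feature keys are hypothetical labels derived from the table, the lowercase names "extra" and "admin" are assumed by analogy with the "guest" and "user" values shown in the API responses, and none of this is LibreCrawl's actual implementation.

```python
TIER_ORDER = ["guest", "user", "extra", "admin"]  # lowest to highest

# Minimum tier per feature group; keys are illustrative labels, not API names.
FEATURE_MIN_TIER = {
    "crawl": "guest",
    "crawler_settings": "user",
    "export_settings": "user",
    "issue_settings": "user",
    "javascript_rendering": "extra",
    "filters": "extra",
    "custom_css": "extra",
    "concurrency": "admin",
    "memory_limits": "admin",
    "proxy_settings": "admin",
}


def effective_tier(account_tier, local_mode=False):
    """In local mode every user is treated as admin."""
    return "admin" if local_mode else account_tier


def is_allowed(account_tier, feature, local_mode=False):
    """True if the tier ranks at or above the feature's minimum tier."""
    tier = effective_tier(account_tier, local_mode)
    return TIER_ORDER.index(tier) >= TIER_ORDER.index(FEATURE_MIN_TIER[feature])
```

For example, is_allowed("user", "filters") is False, while the same call with local_mode=True is True.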

Endpoints

POST /api/register

Register a new user account. The account is created unverified with the "user" tier; in standard mode an admin must verify it before the user can log in.

Request Body

Parameter | Type   | Required | Description
username  | string | Yes      | Username (must be unique)
email     | string | Yes      | Email address (must be unique)
password  | string | Yes      | Password (minimum 8 characters)
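A client can check these constraints before sending the request. The sketch below is a hypothetical client-side pre-check, not the server's validation code; the error strings are the ones listed in the error table on this page, and uniqueness of the username and email can only be checked by the server.

```python
MIN_PASSWORD_LENGTH = 8
REQUIRED_FIELDS = ("username", "email", "password")


def validate_registration(payload):
    """Return an error message string, or None if the payload looks valid."""
    if any(not payload.get(field) for field in REQUIRED_FIELDS):
        return "Missing required fields"
    if len(payload["password"]) < MIN_PASSWORD_LENGTH:
        return "Password must be at least 8 characters"
    return None
```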

Example Request

curl -X POST http://localhost:5000/api/register \
  -H "Content-Type: application/json" \
  -d '{
    "username": "john_doe",
    "email": "[email protected]",
    "password": "SecurePass123"
  }'

Success Response (200 OK)

{
  "success": true,
  "message": "Registration successful"
}

Error Response (400 Bad Request)

{
  "success": false,
  "error": "Username already exists"
}

POST /api/login

Authenticate a user and create a session. Returns a session cookie that must be included in all subsequent requests.

Request Body

Parameter | Type   | Required | Description
username  | string | Yes      | Username or email
password  | string | Yes      | User password

Example Request

curl -X POST http://localhost:5000/api/login \
  -H "Content-Type: application/json" \
  -c cookies.txt \
  -d '{
    "username": "john_doe",
    "password": "SecurePass123"
  }'

Success Response (200 OK)

{
  "success": true,
  "message": "Login successful"
}

# Response includes Set-Cookie header with session ID
Set-Cookie: session=eyJfcGVybWFuZW50Ijp0cn...; HttpOnly; Path=/
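When not using curl's cookie file, a client may need to read the cookie from the header itself. The sketch below parses a Set-Cookie header of the shape shown above with Python's standard http.cookies module; the function name and the sample value "abc123" are made up for illustration.

```python
from http.cookies import SimpleCookie


def parse_session_cookie(set_cookie_header):
    """Extract the session cookie's value and attributes from a Set-Cookie header."""
    cookie = SimpleCookie()
    cookie.load(set_cookie_header)
    morsel = cookie["session"]  # cookie name as shown in the example header
    return {
        "value": morsel.value,
        "http_only": bool(morsel["httponly"]),
        "path": morsel["path"],
    }
```

A call such as parse_session_cookie("session=abc123; HttpOnly; Path=/") returns the value together with the HttpOnly flag and path, which is enough to resend the cookie as a Cookie: session=... header.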

Error Responses

# Invalid credentials (401 Unauthorized)
{
  "success": false,
  "error": "Invalid username or password"
}

# Account not verified (403 Forbidden)
{
  "success": false,
  "error": "Account not verified. Please wait for admin approval."
}

POST /api/guest-login

Create a guest session with limited access. Guest users can perform up to 3 crawls per 24-hour period (tracked by IP address).
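The 3-crawls-per-24-hours rule can be pictured as a per-IP counter over a time window. The sketch below uses a sliding window held in memory; whether LibreCrawl uses a sliding or fixed window, and where it stores the counts, is not stated here, so treat the class and its names as assumptions.

```python
import time
from collections import defaultdict, deque

GUEST_LIMIT = 3
WINDOW_SECONDS = 24 * 60 * 60


class GuestCrawlLimiter:
    """Sliding-window counter keyed by IP address (in-memory, single process)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._crawls = defaultdict(deque)  # ip -> timestamps of recent crawls

    def remaining(self, ip):
        """Crawls left for this IP in the current 24-hour window."""
        now = self._clock()
        stamps = self._crawls[ip]
        # Drop timestamps that have aged out of the window.
        while stamps and now - stamps[0] >= WINDOW_SECONDS:
            stamps.popleft()
        return GUEST_LIMIT - len(stamps)

    def record_crawl(self, ip):
        """Record a crawl; return False if the limit is already reached."""
        if self.remaining(ip) <= 0:
            return False
        self._crawls[ip].append(self._clock())
        return True
```

A False return from record_crawl corresponds to the 429 response documented for this endpoint.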

Request Body

No request body required.

Example Request

curl -X POST http://localhost:5000/api/guest-login \
  -c cookies.txt

Success Response (200 OK)

{
  "success": true,
  "message": "Guest login successful"
}

Error Response (429 Too Many Requests)

{
  "success": false,
  "error": "Guest crawl limit reached (3 per 24 hours)"
}

POST /api/logout

End the current user session and invalidate the session cookie.

Request Body

No request body required.

Headers

Must include valid session cookie.

Example Request

curl -X POST http://localhost:5000/api/logout \
  -b cookies.txt

Success Response (200 OK)

{
  "success": true,
  "message": "Logged out successfully"
}

GET /api/user/info

Get information about the currently authenticated user, including tier level and crawl limits.

Headers

Must include valid session cookie.

Example Request

curl http://localhost:5000/api/user/info \
  -b cookies.txt

Success Response (200 OK)

# For registered users (crawls_remaining is null when crawls are unlimited):
{
  "success": true,
  "user": {
    "id": 1,
    "username": "john_doe",
    "tier": "user",
    "crawls_today": 5,
    "crawls_remaining": null
  }
}

# For guest users:
{
  "success": true,
  "user": {
    "id": null,
    "username": "guest",
    "tier": "guest",
    "crawls_today": 1,
    "crawls_remaining": 2
  }
}
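Clients need to treat a null crawls_remaining differently from a number. A minimal sketch of reading the response, assuming the field names shown in the examples above (the function name is hypothetical):

```python
import json


def describe_quota(response_text):
    """Summarize the crawl quota from a /api/user/info response body."""
    user = json.loads(response_text)["user"]
    remaining = user["crawls_remaining"]
    if remaining is None:  # JSON null means no limit
        return f"{user['username']} ({user['tier']}): unlimited crawls"
    return f"{user['username']} ({user['tier']}): crawls remaining = {remaining}"
```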

Error Response (401 Unauthorized)

{
  "success": false,
  "error": "Authentication required"
}

Password Security

LibreCrawl implements secure password handling:

  • Hashing: Passwords are hashed using bcrypt with automatic salt generation
  • Minimum Length: Passwords must be at least 8 characters
  • Storage: Only hashed passwords are stored in the database
  • Verification: Password verification uses constant-time comparison to prevent timing attacks
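The same pattern (random salt, slow hash, constant-time check) can be sketched with the standard library alone. Note the substitution: this example uses PBKDF2 from hashlib as a stand-in because bcrypt is a third-party package; with the bcrypt package the corresponding calls would be bcrypt.hashpw and bcrypt.checkpw. The iteration count and function names are illustrative assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative work factor for the PBKDF2 stand-in
SALT_BYTES = 16


def hash_password(password):
    """Return salt + derived key; the plain password is never stored."""
    if len(password) < 8:
        raise ValueError("Password must be at least 8 characters")
    salt = os.urandom(SALT_BYTES)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt + digest


def verify_password(password, stored):
    """Re-derive the key with the stored salt and compare in constant time."""
    salt, expected = stored[:SALT_BYTES], stored[SALT_BYTES:]
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)
```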

Session Management

Sessions are managed using Flask's session system:

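  • Cookie-Based: Session data is stored in signed, HTTP-only cookies (Flask's default session cookies are protected against tampering but are not encrypted, so their contents should not be treated as secret)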
  • Automatic Expiration: Crawler instances expire after 1 hour of inactivity
  • Per-User Isolation: Each user gets their own crawler instance and settings
  • Secure Cookies: Set SESSION_COOKIE_SECURE=True in production for HTTPS-only cookies
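Per-user isolation with idle expiry can be pictured as a registry that hands each user their own crawler object and discards entries unused for an hour. This is a sketch under stated assumptions: the class, its method names, and the injected factory are hypothetical and do not describe LibreCrawl's internals.

```python
import time

IDLE_TIMEOUT_SECONDS = 60 * 60  # 1 hour, per the list above


class CrawlerRegistry:
    """Keeps one crawler object per user and drops idle ones."""

    def __init__(self, factory, clock=time.monotonic):
        self._factory = factory  # callable that builds a new crawler
        self._clock = clock
        self._entries = {}       # user_id -> (crawler, last_used)

    def get(self, user_id):
        """Return the user's crawler, creating it on first use."""
        self.purge_idle()
        entry = self._entries.get(user_id)
        crawler = entry[0] if entry else self._factory()
        self._entries[user_id] = (crawler, self._clock())  # refresh last_used
        return crawler

    def purge_idle(self):
        """Remove crawlers unused for longer than the idle timeout."""
        now = self._clock()
        expired = [
            uid for uid, (_, last_used) in self._entries.items()
            if now - last_used > IDLE_TIMEOUT_SECONDS
        ]
        for uid in expired:
            del self._entries[uid]
```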

IP-Based Rate Limiting

Guest users are rate-limited by IP address. To determine the client's IP, the system checks the following sources in priority order:

  1. CF-Connecting-IP - Cloudflare original client IP
  2. X-Forwarded-For - Proxy/load balancer forwarded IP
  3. X-Real-IP - Nginx real IP
  4. REMOTE_ADDR - Direct connection IP
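The lookup order above can be sketched as a small function. Two details are assumptions of this sketch rather than documented behavior: header names are matched case-insensitively, and when X-Forwarded-For holds a comma-separated list, the first entry is used.

```python
HEADER_PRIORITY = ("CF-Connecting-IP", "X-Forwarded-For", "X-Real-IP")


def client_ip(headers, remote_addr):
    """Return the first available client IP, following the order above."""
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in HEADER_PRIORITY:
        value = lowered.get(name.lower(), "").strip()
        if value:
            # X-Forwarded-For can hold "client, proxy1, proxy2"; taking the
            # first entry is an assumption of this sketch.
            return value.split(",")[0].strip()
    return remote_addr  # fall back to the direct connection address
```

Because clients can set these headers themselves, this order is only meaningful when a trusted proxy in front of the server overwrites them.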

Development Tip: Run LibreCrawl with the --local flag during development to bypass authentication and automatically get admin tier access.

Error Handling

The Authentication API returns specific error messages for different failure scenarios:

Status Code | Scenario             | Error Message
400         | Missing fields       | "Missing required fields"
400         | Password too short   | "Password must be at least 8 characters"
400         | Username taken       | "Username already exists"
400         | Email taken          | "Email already exists"
401         | Invalid credentials  | "Invalid username or password"
401         | No session           | "Authentication required"
403         | Unverified account   | "Account not verified. Please wait for admin approval."
429         | Guest limit exceeded | "Guest crawl limit reached (3 per 24 hours)"
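A client can turn these responses into typed errors and decide what to do next. The sketch below assumes the JSON error shape shown on this page ({"success": false, "error": "..."}); the ApiError class and the suggested actions are hypothetical client-side policy, not part of the API.

```python
import json


class ApiError(Exception):
    """Carries the HTTP status and the API's error message."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def error_from_response(status, body_text):
    """Build an ApiError from a status code and a JSON error body."""
    try:
        message = json.loads(body_text).get("error", "Unknown error")
    except (ValueError, AttributeError):  # not JSON, or not a JSON object
        message = "Unknown error"
    return ApiError(status, message)


def next_action(error):
    """Suggest what a client might do next (illustrative policy only)."""
    if error.status == 401:
        return "log in again"
    if error.status == 403:
        return "wait for admin approval"
    if error.status == 429:
        return "retry after the 24-hour window"
    return "fix the request"
```

With urllib, the status and body for error_from_response are available as the code attribute and read() method of the raised urllib.error.HTTPError.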
