diff --git a/README.md b/README.md
index a334212e..32b33a54 100755
--- a/README.md
+++ b/README.md
@@ -3,235 +3,57 @@
 [![Autotag](https://github.com/stratosphereips/game-states-maker/actions/workflows/autotag.yml/badge.svg)](https://github.com/stratosphereips/game-states-maker/actions/workflows/autotag.yml)
 [![Docs](https://github.com/stratosphereips/game-states-maker/actions/workflows/deploy-docs.yml/badge.svg)](https://stratosphereips.github.io/NetSecGame/)
 [![Docker Publish](https://github.com/stratosphereips/game-states-maker/actions/workflows/docker-publish.yml/badge.svg)](https://github.com/stratosphereips/game-states-maker/actions/workflows/docker-publish.yml)
-
+[![PyPI Version](https://img.shields.io/pypi/v/netsecgame.svg)](https://pypi.org/project/netsecgame/)
 
 The NetSecGame (Network Security Game) is a framework for training and evaluation of AI agents in network security tasks (both offensive and defensive). It is built with [CYST](https://pypi.org/project/cyst/) network simulator and enables rapid development and testing of AI agents in highly configurable scenarios. Examples of implemented agents can be seen in the submodule [NetSecGameAgents](https://github.com/stratosphereips/NetSecGameAgents/tree/main).
 
-## Installation Guide
-It is recommended to run the environment in the Docker container. The up-to-date image can be found in [Dockerhub](https://hub.docker.com/r/stratosphereips/netsecgame).
-```bash
-docker pull stratosphereips/netsecgame
-```
-#### Building the image locally
-Optionally, you can build the image locally with:
-```bash 
-docker build -t netsecgame:local .
-```
+## Installation
 
-To build a Whitebox version of the game image locally, you can use the `--build-arg` flag to override the default module path:
-> [!WARNING]
-> The Whitebox variant is currently experimental.
-```bash
-docker build --build-arg GAME_MODULE="netsecgame.game.worlds.WhiteBoxNetSecGame" -t netsecgame:local-whitebox .
-```
-
-### Installing from source
-In case you need to modify the envirment and run directly, we recommed to insall it in a virtual environemnt (Python vevn or Conda):
-#### Python venv
-1. Create new virtual environment
-```bash
-python -m venv <venv-name>
-```
-2. Activate newly created venv
+### Docker (recommended)
 ```bash
-source <venv-name>/bin/activate
+docker pull stratosphereips/netsecgame
 ```
 
-#### Conda
-1.  Create new conda environment
+### pip install
 ```bash
-conda create --name aidojo python==3.12
-```
-2. Activate newly created conda env
-```bash
-conda activate aidojo
+pip install netsecgame
 ```
 
-### After preparing virutual environment, install using pip:
+### From source
 ```bash
 pip install -e .
 ```
 
-## Quick Start
-A task configuration YAML file is required for starting the NetSecGame environment.  For the first step, the example task configuration is recommended:
-
-### Example Configuration
-```yaml
-# Example of the task configuration for NetSecGame
-# The objective of the Attacker in this task is to locate specific data
-# and exfiltrate it to a remote C&C server.
-# The scenario starts AFTER the initial breach of the local network
-# (the attacker controls 1 local device + the remote C&C server).
-
-coordinator: 
-  agents: 
-    Attacker: # Configuration of 'Attacker' agents
-      max_steps: 25 # timout set for the role `Attacker`
-      goal: # Definition of the goal state
-        description: "Exfiltrate data from Samba server to remote C&C server."
-        known_networks: []
-        known_hosts: []
-        controlled_hosts: []
-        known_services: {}
-        known_data: {213.47.23.195: [[User1,DataFromServer1]]} # winning condition
-        known_blocks: {}
-      start_position: # Definition of the starting state (keywords "random" and "all" can be used)
-        known_networks: []
-        known_hosts: []
-        controlled_hosts: [213.47.23.195, random] # keyword 'random' will be replaced by randomly selected IP during initilization
-        known_services: {}
-        known_data: {}
-        known_blocks: {}
-
-    Defender:
-      goal:
-        description: "Block all attackers"
-        known_networks: []
-        known_hosts: []
-        controlled_hosts: []
-        known_services: {}
-        known_data: {}
-        known_blocks: {213.47.23.195: 'all_attackers'}
-
-      start_position:
-        known_networks: []
-        known_hosts: []
-        controlled_hosts: []
-        known_services: {}
-        known_data: {}
-        blocked_ips: {}
-        known_blocks: {}
+For detailed installation instructions (venv, conda, building Docker locally, Whitebox variant), see the [Getting Started guide](https://stratosphereips.github.io/NetSecGame/getting_started/).
 
-env: # Environment configuraion
-  scenario: 'two_networks_tiny' # use the smallest topology for this example
-  use_global_defender: False # Do not use global SIEM Defender
-  use_dynamic_addresses: False # Do not randomize IP addresses
-  use_firewall: True # Use firewall
-  save_trajectories: False # Do not store trajectories
-  required_players: 1 # Minimal amount of agents requiered to start the game
-  rewards: # Configurable reward function
-    success: 100
-    step: -1
-    fail: -10
-    false_positive: -5 
-```
-For detailed configuration instructions, please refer to the [Configuration Documentation](https://stratosphereips.github.io/NetSecGame/configuration/).
+## Quick Start
 
+1. Prepare a task configuration YAML file (see [example](examples/example_task_configuration.yaml) or the [Configuration docs](https://stratosphereips.github.io/NetSecGame/configuration/)).
 
-### Starting the NetSecGame
-With the configuration ready the environment can be started in selected port
-#### In Docker container
+2. Start the server:
 ```bash
-docker run -d --rm --name nsg-server\
+# Docker
+docker run -d --rm --name nsg-server \
   -v $(pwd)/examples/example_task_configuration.yaml:/netsecgame/netsecenv_conf.yaml \
   -v $(pwd)/logs:/netsecgame/logs \
-  -p 9000:9000 stratosphereips/netsecgame \
-  --debug_level="INFO"
-```
-`--name nsg-server`: specifies the name of the container
-
-`-v <your-configuration-yaml>:/netsecgame/netsecenv_conf.yaml` : Mapping of the configuration file
-
-`-v $(pwd)/logs:/netsecgame/logs`: Mapping of the folder where logs are stored
+  -p 9000:9000 stratosphereips/netsecgame
 
-` -p <selected-port>:9000`: Mapping of the port in which the server runs
-
-`--debug_level` is an optional parameter to control the logging level `--debug_level=["DEBUG", "INFO", "WARNING", "CRITICAL"]` (defaul=`"INFO"`):
-##### Running on Windows (with Docker desktop)
-When running on Windows, Docker desktop is required.
-```cmd
-docker run -d --rm --name netsecgame-server ^
-  -p 9000:9000 ^
-  -v "%cd%\examples\example_task_configuration.yaml:/netsecgame/netsecenv_conf.yaml" ^
-  -v "%cd%\logs:/netsecgame/logs" ^
-  stratosphereips/netsecgame:latest ^
-  --debug_level="INFO"
-```
-
-#### Locally
-The environment can be started locally with from the root folder of the repository with following command:
-```bash
+# Or locally
 python3 -m netsecgame.game.worlds.NetSecGame \
   --task_config=./examples/example_task_configuration.yaml \
-  --game_port=9000 \
-  --debug_level="INFO"
+  --game_port=9000
 ```
-Upon which the game server is created on `localhost:9000` to which the agents can connect to interact in the NetSecGame.
 
-## Documentation
-You can find user documentation at [https://stratosphereips.github.io/NetSecGame/](https://stratosphereips.github.io/NetSecGame/)
-
-### Components of the NetSecGame Environment
-The architecture of the environment can be seen [here](docs/architecture.md).
-The NetSecGame environment has several components in the following files:
-```
-├── netsecgame/
-|	├── agents/
-|		├── base_agent.py # Basic agent class. Defines the API for agent-server communication
-|	├── game/
-|		├── scenarios/
-|		    ├── tiny_scenario_configuration.py
-|		    ├── smaller_scenario_configuration.py
-|		    ├── scenario_configuration.py
-|		    ├── three_net_scenario.py
-|		    ├── two_nets.py
-|		    ├── two_nets_tiny.py
-|		|   ├── two_nets_small.py
-|		|   ├── one_net.py
-|		├── worlds/
-|   		├── NetSecGame.py # (NSG) basic simulation 
-|   		├── RealWorldNetSecGame.py # Extension of `NSG` - runs actions in the *network of the host computer*
-|   		├── CYSTCoordinator.py # Extension of `NSG` - runs simulation in CYST engine.
-|   		├── WhiteBoxNetSecGame.py # Extension of `NSG` - provides agents with full list of actions upon registration.
-|		├── agent_server.py # Agent server implementation
-|		├── config_parser.py # NSG task configuration parser
-|		├── configuration_manager.py # Helper tool to collect and parse query configuration of the game.
-|		├── coordinator.py # Core game server. Not to be run as stand-alone world (see worlds/)
-|	    ├── global_defender.py # Stochastic (non-agentic defender)
-|	├── game_components.py # contains basic building blocks of the environment
-|	├── utils/
-|		├── utils.py
-|		├── log_parser.py
-|		├── gameplay_graphs.py
-|		├── actions_parser.py
-|		├── trajectory_recorder.py
-
-```
-#### Directory Details
-- `coordinator.py`: Basic coordinator class. Handles agent communication and coordination. **Does not implement dynamics of the world** and must be extended (see examples in `worlds/`).
-- `game_components.py`: Implements a library with objects used in the environment. See [detailed explanation](./docs/game_components.md) of the game components.
-- `global_defender.py`: Implements a global (omnipresent) defender that can be used to stop agents. Simulation of SIEM.
-
-##### **`worlds/`**
-Modules for different world configurations:
-- `NetSecGame.py`: Coordinator for the Network Security Game.
-- `RealWorldNetSecGame.py`: Real-world NSG coordinator (actions are executed in the *real network*).
-- `CYSTCoordinator.py`: Coordinator for CYST-based simulations (requires CYST running).
-- `WhiteBoxNetSecGame.py`: Coordinator for Whitebox NSG (full action list provided to agents).
+3. Connect an agent (see [NetSecGameAgents](https://github.com/stratosphereips/NetSecGameAgents) for reference implementations).
 
-##### **`scenarios/`**
-Predefined scenario configurations:
-- `tiny_scenario_configuration.py`: A minimal example scenario.
-- `smaller_scenario_configuration.py`: A compact scenario configuration used for development and rapid testing.
-- `scenario_configuration.py`: The main scenario configuration.
-- `three_net_scenario.py`: Configuration for a three-network scenario. Used for the evaluation of the model overfitting.
-- `one_net.py`: A single network scenario.
-- `two_nets.py`: A two-network scenario.
-- `two_nets_tiny.py`: A tiny two-network scenario.
-- `two_nets_small.py`: A small two-network scenario.
-
-Implements the network game's configuration of hosts, data, services, and connections. It is taken from [CYST](https://pypi.org/project/cyst/).
+## Documentation
 
-##### **`utils/`**
-Helper modules:
-- `utils.py`: General-purpose utilities.
-- `log_parser.py`: Tools for parsing game logs.
-- `gameplay_graphs.py`: Tools for visualizing gameplay data.
-- `actions_parser.py`: Parsing and analyzing game actions.
-- `trajectory_recorder.py`: Tools for recording game trajectories.
+Full documentation is available at **[https://stratosphereips.github.io/NetSecGame/](https://stratosphereips.github.io/NetSecGame/)**
 
-The [scenarios](#definition-of-the-network-topology) define the **topology** of a network (number of hosts, connections, networks, services, data, users, firewall rules, etc.) while the [task-configuration](#task-configuration) is to be used for definition of the exact task for the agent in one of the scenarios (with fix topology).
-- Agents compatible with the NetSecGame are located in a separate repository [NetSecGameAgents](https://github.com/stratosphereips/NetSecGameAgents/tree/main)
+- [Getting Started](https://stratosphereips.github.io/NetSecGame/getting_started/) — Installation, configuration, first agent
+- [Architecture](https://stratosphereips.github.io/NetSecGame/architecture/) — Game components, actions, preconditions, project structure
+- [Configuration](https://stratosphereips.github.io/NetSecGame/configuration/) — Full task and environment configuration reference
+- [API Reference](https://stratosphereips.github.io/NetSecGame/game_components/) — Auto-generated code documentation
 
 ### Assumptions of the NetSecGame
 1. NetSecGame works with the closed-world assumption. Only the defined entities exist in the simulation.
@@ -240,107 +62,21 @@ The [scenarios](#definition-of-the-network-topology) define the **topology** of
 
 - The action FindServices finds the new services in a host. If in a subsequent call to FindServices there are fewer services, they completely replace the list of previous services found. That is, each list of services is the final one, and no memory of previous open services is retained.
 
-#### Assumptions and Conditions for Actions
-1. When playing the `ExploitService` action, it is expected that the agent has discovered this service before (by playing `FindServices` in the `target_host` before this action)
-2. The `Find Data` action finds all the available data in the host if successful.
-3. The `Find Data` action requires ownership of the target host.
-4. Playing `ExfiltrateData` requires controlling **BOTH** source and target hosts
-5. Playing `Find Services` can be used to discover hosts (if those have any active services)
-6. Parameters of `ScanNetwork` and `FindServices` can be chosen arbitrarily (they don't have to be listed in `known_newtworks`/`known_hosts`)
-7. The `BlockIP` action needs its `source_host` and  `target_host` parameters to be in the controlled list of the Agent. 
-
-> [!NOTE]  
-> The global defender, available in the previous environment versions, will not be supported in the future. To enable backward compatibility, the global defender functionality can be enabled by adding `use_global_defender: True` to the configuration YAML file in the `env` section. This option is disabled by default.
-
-
-#### How the global defender detects
-
-The global defender is implemented as a stochastic detection, which means the detection is probabilistic.
-
-First, all the actions are analyzed in time windows (TW) (default is 5 actions). So the last 5 actions are analyzed together. Every new action moves the slides the TW by one position.
-
-
-##### Detection Thresholds
-
-- Detection Probabilities
-
-Each action type has a predefined probability of being detected. However, detection does not happen for every action; it depends on additional behavioral patterns.
-
-    - ScanNetwork: 0.05,
-    - FindServices: 0.075,
-    - ExploitService: 0.1,
-    - FindData: 0.025,
-    - ExfiltrateData: 0.025,
-    - BlockIP: 0.01
-
-This means that one action of the type ScanNetwork has a 5% probability of being detected by the defender. 
-This way, we can define some basic probabilities based on how good we want the defender to be.
-
-- Type Ratio Threshold
-
-The Type Ratio Threshold defines how frequently an action must appear in the time window to be considered for detection. If the ratio is below this threshold, no detection occurs.
-
-For example, if the last action is ScanNetwork in a TW with [ScanNetwork, FindService, ScanNetwork, FindData, ScanNetwork], the ratio is 3/5 = 0.6
-
-The default values of the thresholds are:
-
-    - ScanNetwork: 0.25,
-    - FindServices: 0.3,
-    - ExploitService: 0.25,
-    - FindData: 0.5,
-    - ExfiltrateData: 0.25,
-    - BlockIP: 1
-
-- Consecutive Actions Threshold
-
-The Consecutive Actions Threshold determines the maximum number of consecutive occurrences of an action type within the time window before detection is triggered.
-
-The default values of the thresholds are:
-
-    - ScanNetwork: 2,
-    - FindServices: 3,
-    - ExfiltrateData: 2
-
-- Repeated Action Threshold
-
-The Repeated Action Threshold applies to certain action types and defines the number of times a specific action must appear in the __entire episode__ before it can be considered for detection.
-
-The default values are:
-
-    - ExploitService: 2
-    - FindData: 2
-
-
-##### Decision Logic
-
-The system monitors actions and maintains a history of recent ones within the time window.
-
-- If an action's Type Ratio Threshold is met within the time window or it exceeds the Consecutive Actions Threshold, it is evaluated for detection.
-
-- If the action type has a Repeated Action Threshold and has not been repeated enough times in the episode, it is ignored.
-
-- If an action meets the conditions above, it is subject to detection based on its predefined probability.
-
-- Actions that do not meet any threshold conditions are ignored, ensuring that occasional activity does not lead to unnecessary detections.
-
-This approach ensures that only repeated or excessive behavior is flagged, reducing false positives while maintaining a realistic monitoring system.
-
-
+For detailed action preconditions and effects, see the [Architecture documentation](https://stratosphereips.github.io/NetSecGame/architecture/).
 
-### Interaction with the Environment
-When the game server is created, [agents](https://github.com/stratosphereips/NetSecGameAgents/tree/main) connect to it and interact with the environment. In every step of the interaction, agents submits an [Action](./docs/game_components.md#netsecgame.game_components.Action) and receive [Observation](./docs/game_components.md#netsecgame.game_components.Observation) with `next_state`, `reward`, `is_terminal`, `end`, and `info` values. Once the terminal state or timeout is reached, no more interaction is possible until the agent asks for a game reset. Each agent should extend the `BaseAgent` class in [agents](https://github.com/stratosphereips/NetSecGameAgents/tree/main).
+## Contributing
 
-## Testing the environment
+### Testing the environment
 
-It is advised that after every change, you test if the env is running correctly by doing
+After every change, verify the environment is working correctly:
 
 ```bash
 tests/run_all_tests.sh
 ```
-This will load and run the unit tests in the `tests` folder. After passing all tests, linting and formatting are checked with ruff.
+This runs the unit tests in the `tests` folder, followed by linting and formatting checks with ruff.
 
-## Code adaptation for new configurations
-The code can be adapted to new configurations of games and for new agents. See [Agent repository](https://github.com/stratosphereips/NetSecGameAgents/tree/main) for more details.
+### Code adaptation
+The code can be adapted to new game configurations and new agents. See the [Agent repository](https://github.com/stratosphereips/NetSecGameAgents/tree/main) for more details.
 
 ## About us
 This code was developed at the [Stratosphere Laboratory at the Czech Technical University in Prague](https://www.stratosphereips.org/).
diff --git a/README_pypi.md b/README_pypi.md
index 81976b6c..2044fc55 100644
--- a/README_pypi.md
+++ b/README_pypi.md
@@ -43,67 +43,7 @@ python3 -m netsecgame.game.worlds.NetSecGame \
   --game_port=9000
 ```
 ### Configuration
-To start the game, a task configuration file must be provided. Task configuration specifies the starting points and goals for agents, the episode length, rewards, and other game properties. Here is an example of the configuration:
-```YAML
-# Example of the task configuration for NetSecGame
-# The objective of the Attacker in this task is to locate specific data
-# and exfiltrate it to a remote C&C server.
-# The scenario starts AFTER the initial breach of the local network
-# (the attacker controls 1 local device + the remote C&C server).
-
-coordinator: 
-  agents: 
-    Attacker: # Configuration of 'Attacker' agents
-      max_steps: 25 # timeout set for the role `Attacker`
-      goal: # Definition of the goal state
-        description: "Exfiltrate data from Samba server to remote C&C server."
-        known_networks: []
-        known_hosts: []
-        controlled_hosts: []
-        known_services: {}
-        known_data: {213.47.23.195: [[User1,DataFromServer1]]} # winning condition
-        known_blocks: {}
-      start_position: # Definition of the starting state (keywords "random" and "all" can be used)
-        known_networks: []
-        known_hosts: []
-        controlled_hosts: [213.47.23.195, random] # keyword 'random' will be replaced by randomly selected IP during initialization
-        known_services: {}
-        known_data: {}
-        known_blocks: {}
-
-    Defender:
-      goal:
-        description: "Block all attackers."
-        known_networks: []
-        known_hosts: []
-        controlled_hosts: []
-        known_services: {}
-        known_data: {}
-        known_blocks: {213.47.23.195: 'all_attackers'}
-
-      start_position:
-        known_networks: []
-        known_hosts: []
-        controlled_hosts: []
-        known_services: {}
-        known_data: {}
-        blocked_ips: {}
-        known_blocks: {}
-
-env: # Environment configuration
-  scenario: 'two_networks_tiny' # use the smallest topology for this example
-  use_global_defender: False # Do not use global SIEM Defender
-  use_dynamic_addresses: False # Do not randomize IP addresses
-  use_firewall: True # Use firewall
-  save_trajectories: False # Do not store trajectories
-  required_players: 1 # Minimal amount of agents required to start the game
-  rewards: # Configurable reward function
-    success: 100
-    step: -1
-    fail: -10
-    false_positive: -5
-```
-For detailed configuration instructions, please refer to the [Configuration Documentation](https://stratosphereips.github.io/NetSecGame/configuration/).
+A YAML task configuration file is required to start the game. It defines starting positions, goals, rewards, network topology, and other game properties. An example configuration is included in the [GitHub repository](https://github.com/stratosphereips/NetSecGame/tree/main/examples). For a full reference of all options, see the [Configuration Documentation](https://stratosphereips.github.io/NetSecGame/configuration/).
 
 ## Creating Agents
 
diff --git a/docs/architecture.md b/docs/architecture.md
index 81620cdf..1428bb66 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -95,17 +95,61 @@ In the following table, we describe the effects of selected actions and their pr
 |ExfiltrateData| `source_host`,`target_host`, `data` |`source_host`, `target_host` ∈ `controlled_hosts` AND `data` ∈ `known_data`| extends `known_data[target_host]` with `data`|
 |BlockIP | `source_host`, `target_host`, `blocked_host`|`source_host` ∈ `controlled_hosts`| extends `known_blocks[target_host]` with `blocked_host`|
 
-#### Assumption and Conditions for Actions
+#### Assumptions and Conditions for Actions
 1. When playing the `ExploitService` action, it is expected that the agent has discovered this service before (by playing `FindServices` in the `target_host` before this action)
 2. The `FindData` action finds all the available data in the host if successful.
 3. The `FindData` action requires ownership of the target host.
 4. Playing `ExfiltrateData` requires controlling **BOTH** source and target hosts
-5. Playing `Find Services` can be used to discover hosts (if those have any active services)
+5. Playing `FindServices` can be used to discover hosts (if those have any active services)
 6. Parameters of `ScanNetwork` and `FindServices` can be chosen arbitrarily (they don't have to be listed in `known_networks`/`known_hosts`)
+7. The `BlockIP` action requires both its `source_host` and `target_host` parameters to be in the agent's `controlled_hosts`.
 
 ### Observations
 After submitting Action `a` to the environment, agents receive an `Observation` in return. Each observation consists of 4 parts:
 - `state`:`Gamestate` - with the current view of the environment [state](#gamestate)
 - `reward`: `int` - with the immediate reward agent gets for playing Action `a`
 - `end`:`bool` - indicating if the interaction can continue after playing Action `a`
-- `info`: `dict` - placeholder for any information given to the agent (e.g., the reason why `end is True` )
\ No newline at end of file
+- `info`: `dict` - placeholder for any information given to the agent (e.g., the reason why `end is True` )
+
+## Project Structure
+
+```
+├── netsecgame/
+│   ├── agents/
+│   │   ├── base_agent.py          # Base agent class — API for agent-server communication
+│   ├── game/
+│   │   ├── scenarios/
+│   │   │   ├── one_net.py             # Single network scenario
+│   │   │   ├── two_nets_tiny.py       # Tiny two-network scenario
+│   │   │   ├── two_nets_small.py      # Small two-network scenario
+│   │   │   ├── two_nets.py            # Two-network scenario
+│   │   │   ├── three_net_scenario.py  # Three-network scenario
+│   │   ├── worlds/
+│   │   │   ├── NetSecGame.py          # Base simulation coordinator
+│   │   │   ├── RealWorldNetSecGame.py # Real-network coordinator
+│   │   │   ├── CYSTCoordinator.py     # CYST engine coordinator
+│   │   │   ├── WhiteBoxNetSecGame.py  # Whitebox coordinator (full action list)
+│   │   ├── agent_server.py        # Agent TCP server implementation
+│   │   ├── config_parser.py       # Task configuration parser
+│   │   ├── configuration_manager.py # Configuration query helper
+│   │   ├── coordinator.py         # Core game coordinator (extend via worlds/)
+│   │   ├── global_defender.py     # Stochastic SIEM defender
+│   ├── game_components.py         # Core building blocks (IP, Network, Action, GameState, etc.)
+│   ├── utils/
+│   │   ├── utils.py               # General-purpose utilities
+│   │   ├── trajectory_recorder.py # Episode trajectory recording
+│   │   ├── trajectory_analysis.py # Trajectory analysis tools
+│   │   ├── log_parser.py          # Game log parsing
+│   │   ├── gameplay_graphs.py     # Gameplay visualization
+│   │   ├── actions_parser.py      # Action parsing and analysis
+│   │   ├── aidojo_log_colorizer.py # Log colorization
+```
+
+### Key Components
+
+- **[`coordinator.py`](game_coordinator.md)** — Base coordinator class handling agent communication and coordination. Does not implement world dynamics — must be extended (see `worlds/`).
+- **[`game_components.py`](game_components.md)** — Library of core objects used throughout the environment.
+- **[`global_defender.py`](global_defender.md)** — Stochastic omnipresent defender simulating a SIEM system.
+- **[`base_agent.py`](base_agent.md)** — Base class for all agents. Implements the TCP communication protocol.
+
+The [scenarios](#) define the **topology** of a network (hosts, connections, networks, services, data, firewall rules) while the [task configuration](configuration.md) defines the exact task for agents within a given topology.
\ No newline at end of file
diff --git a/docs/base_agent.md b/docs/base_agent.md
new file mode 100644
index 00000000..7146b137
--- /dev/null
+++ b/docs/base_agent.md
@@ -0,0 +1,6 @@
+# Base Agent
+The `BaseAgent` class provides the foundational interface for all agents interacting with the NetSecGame environment. It handles TCP socket communication with the game server, agent registration, game reset requests, and the core action-observation loop.
+
+All custom agents should extend this class and implement their decision-making logic by overriding a method like `choose_action` (see [Getting Started](getting_started.md#creating-your-first-agent) for an example).
+
+::: netsecgame.agents.base_agent.BaseAgent
diff --git a/docs/getting_started.md b/docs/getting_started.md
new file mode 100644
index 00000000..6120e7d5
--- /dev/null
+++ b/docs/getting_started.md
@@ -0,0 +1,220 @@
+# Getting Started
+
+This guide covers installation, configuration, and running your first NetSecGame session.
+
+## Installation
+
+### Docker (recommended)
+
+The easiest way to run the NetSecGame server is via the official Docker image:
+
+```bash
+docker pull stratosphereips/netsecgame
+```
+
+#### Building the image locally
+
+```bash
+docker build -t netsecgame:local .
+```
+
+To build the experimental **Whitebox** variant (agents receive the full action list upon registration):
+
+```bash
+docker build --build-arg GAME_MODULE="netsecgame.game.worlds.WhiteBoxNetSecGame" -t netsecgame:local-whitebox .
+```
+
+!!! warning
+    The Whitebox variant is currently experimental.
+
+### pip install (agent development)
+
+To install the package for developing agents:
+
+```bash
+pip install netsecgame
+```
+
+To include dependencies for running the game server locally:
+
+```bash
+pip install netsecgame[server]
+```
+
+### Installing from source
+
+For modifying the environment itself, install in a virtual environment:
+
+=== "Python venv"
+    ```bash
+    python -m venv <venv-name>
+    source <venv-name>/bin/activate
+    pip install -e .
+    ```
+
+=== "Conda"
+    ```bash
+    conda create --name aidojo python==3.12
+    conda activate aidojo
+    pip install -e .
+    ```
+
+## Task Configuration
+
+A YAML configuration file defines the task for the NetSecGame. It specifies starting positions, goals, rewards, and environment properties.
+
+### Example Configuration
+
+```yaml
+# The objective of the Attacker in this task is to locate specific data
+# and exfiltrate it to a remote C&C server.
+# The scenario starts AFTER the initial breach of the local network
+# (the attacker controls 1 local device + the remote C&C server).
+
+coordinator: 
+  agents: 
+    Attacker: # Configuration of 'Attacker' agents
+      max_steps: 25 # timeout set for the role `Attacker`
+      goal: # Definition of the goal state
+        description: "Exfiltrate data from Samba server to remote C&C server."
+        known_networks: []
+        known_hosts: []
+        controlled_hosts: []
+        known_services: {}
+        known_data: {213.47.23.195: [[User1,DataFromServer1]]} # winning condition
+        known_blocks: {}
+      start_position: # Definition of the starting state
+        known_networks: []
+        known_hosts: []
+        controlled_hosts: [213.47.23.195, random] # 'random' is replaced during init
+        known_services: {}
+        known_data: {}
+        known_blocks: {}
+
+    Defender:
+      goal:
+        description: "Block all attackers"
+        known_networks: []
+        known_hosts: []
+        controlled_hosts: []
+        known_services: {}
+        known_data: {}
+        known_blocks: {213.47.23.195: 'all_attackers'}
+
+      start_position:
+        known_networks: []
+        known_hosts: []
+        controlled_hosts: []
+        known_services: {}
+        known_data: {}
+        blocked_ips: {}
+        known_blocks: {}
+
+env: # Environment configuration
+  scenario: 'two_networks_tiny' # use the smallest topology
+  use_global_defender: False
+  use_dynamic_addresses: False
+  use_firewall: True
+  save_trajectories: False
+  required_players: 1
+  rewards:
+    success: 100
+    step: -1
+    fail: -10
+    false_positive: -5
+```
+
+For a full reference of all configuration options, see the [Configuration Documentation](configuration.md).
+
+## Running the Game Server
+
+### In Docker
+
+```bash
+docker run -d --rm --name nsg-server \
+  -v $(pwd)/examples/example_task_configuration.yaml:/netsecgame/netsecenv_conf.yaml \
+  -v $(pwd)/logs:/netsecgame/logs \
+  -p 9000:9000 stratosphereips/netsecgame \
+  --debug_level="INFO"
+```
+
+| Flag | Description |
+|------|-------------|
+| `--name nsg-server` | Name of the container |
+| `-v <config>:/netsecgame/netsecenv_conf.yaml` | Mount your configuration file |
+| `-v $(pwd)/logs:/netsecgame/logs` | Mount logs directory |
+| `-p <port>:9000` | Expose the game server port |
+| `--debug_level` | Logging level: `DEBUG`, `INFO`, `WARNING`, `CRITICAL` (default: `INFO`) |
+
+#### Running on Windows (Docker Desktop)
+
+```cmd
+docker run -d --rm --name nsg-server ^
+  -p 9000:9000 ^
+  -v "%cd%\examples\example_task_configuration.yaml:/netsecgame/netsecenv_conf.yaml" ^
+  -v "%cd%\logs:/netsecgame/logs" ^
+  stratosphereips/netsecgame:latest ^
+  --debug_level="INFO"
+```
+
+### Locally
+
+```bash
+python3 -m netsecgame.game.worlds.NetSecGame \
+  --task_config=./examples/example_task_configuration.yaml \
+  --game_port=9000 \
+  --debug_level="INFO"
+```
+
+The game server will start on `localhost:9000`.
+
+## Creating Your First Agent
+
+Agents connect to the game server and interact using the standard RL loop: submit an [Action](game_components.md), receive an [Observation](architecture.md#observations).
+
+All agents should extend the [`BaseAgent`](base_agent.md) class:
+
+```python
+from netsecgame import BaseAgent, Action, GameState, Observation, AgentRole
+
+class MyAgent(BaseAgent):
+    def __init__(self, host, port, role: AgentRole):
+        super().__init__(host, port, role)
+
+    def choose_action(self, observation: Observation) -> Action:
+        # Define your logic here based on observation.state and return an Action
+        raise NotImplementedError
+
+def main():
+    agent = MyAgent(host="localhost", port=9000, role=AgentRole.Attacker)
+    observation = agent.register()
+
+    while not observation.end:
+        action = agent.choose_action(observation)
+        observation = agent.make_step(action)
+
+    agent.terminate_connection()
+
+if __name__ == "__main__":
+    main()
+```
+
+For full agent implementations, see the [NetSecGameAgents](https://github.com/stratosphereips/NetSecGameAgents) repository.
+
+## Interaction Flow
+
+```mermaid
+sequenceDiagram
+    participant Agent
+    participant Server as NetSecGame Server
+    
+    Agent->>Server: JoinGame (register)
+    Server-->>Agent: Initial Observation (state, reward, end, info)
+    
+    loop Until end == True
+        Agent->>Server: Action (e.g., ScanNetwork, ExploitService)
+        Server-->>Agent: Observation (next_state, reward, end, info)
+    end
+    
+    Agent->>Server: ResetGame (optional)
+    Server-->>Agent: Initial Observation (new episode)
+    
+    Agent->>Server: QuitGame
+```
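The request/response loop in the sequence diagram above can be tried out without a running game server. The following is a minimal sketch under stated assumptions: `FakeObservation`, `FakeServer`, and `run_episode` are hypothetical names invented for illustration and are not part of the `netsecgame` package. Only the message names (`JoinGame`, `Action`, `ResetGame`, `QuitGame`) and the observation fields (state, reward, end, info) are taken from the diagram.

```python
from dataclasses import dataclass, field


@dataclass
class FakeObservation:
    """Hypothetical stand-in mirroring the (state, reward, end, info) fields."""
    state: dict
    reward: int
    end: bool
    info: dict = field(default_factory=dict)


class FakeServer:
    """Hypothetical in-memory server: every action costs `step_reward`
    and the episode ends after `max_steps` actions."""

    def __init__(self, max_steps=3, step_reward=-1):
        self.max_steps = max_steps
        self.step_reward = step_reward
        self.steps = 0

    def handle(self, message, action=None):
        if message in ("JoinGame", "ResetGame"):
            self.steps = 0  # both messages start a fresh episode
            return FakeObservation({"step": 0}, 0, False)
        if message == "Action":
            self.steps += 1
            end = self.steps >= self.max_steps
            info = {"end_reason": "max_steps"} if end else {}
            return FakeObservation({"step": self.steps}, self.step_reward, end, info)
        if message == "QuitGame":
            return None
        raise ValueError(f"Unknown message: {message}")


def run_episode(server, policy, first_message="JoinGame"):
    """Play one episode and return the accumulated reward."""
    observation = server.handle(first_message)
    total_reward = 0
    while not observation.end:
        observation = server.handle("Action", policy(observation))
        total_reward += observation.reward
    return total_reward


def always_scan(observation):
    """Trivial policy that ignores the observation."""
    return "ScanNetwork"


server = FakeServer(max_steps=3)
first = run_episode(server, always_scan)                # JoinGame + 3 actions
second = run_episode(server, always_scan, "ResetGame")  # optional new episode
server.handle("QuitGame")
print(first, second)  # -3 -3
```

With a step reward of -1 and a three-step limit, each episode accumulates -3. A real agent would replace `FakeServer.handle` with the `register()` and `make_step()` calls shown in the `BaseAgent` example.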
diff --git a/docs/global_defender.md b/docs/global_defender.md
new file mode 100644
index 00000000..a7a2e957
--- /dev/null
+++ b/docs/global_defender.md
@@ -0,0 +1,76 @@
+# Global Defender
+
+!!! note
+    The global defender from earlier versions of the environment is deprecated and will not be supported in future releases. For backward compatibility, it can still be enabled by adding `use_global_defender: True` to the `env` section of the configuration YAML file. The option is disabled by default.
+
+The global defender is a stochastic (non-agentic) detection system that simulates a SIEM. It monitors agent actions and probabilistically detects suspicious behavior patterns.
+
+## How Detection Works
+
+All actions are analyzed in **time windows (TW)** of 5 actions (by default). Every new action slides the window by one position. The defender evaluates whether the latest action should trigger a detection based on three types of thresholds.
+
+## Detection Thresholds
+
+### Detection Probabilities
+
+Each action type has a base detection probability. The probability is applied only after the action meets the behavioral threshold conditions described below.
+
+| Action Type | Detection Probability |
+|---|---|
+| ScanNetwork | 0.05 |
+| FindServices | 0.075 |
+| ExploitService | 0.1 |
+| FindData | 0.025 |
+| ExfiltrateData | 0.025 |
+| BlockIP | 0.01 |
+
+For example, a `ScanNetwork` action that meets the threshold conditions is detected with probability 0.05 (5%).
+
+### Type Ratio Threshold
+
+Defines how frequently an action must appear in the time window to be considered for detection. If the ratio is below the threshold, no detection occurs.
+
+**Example:** If the last action is `ScanNetwork` in a TW of `[ScanNetwork, FindServices, ScanNetwork, FindData, ScanNetwork]`, the ratio is 3/5 = 0.6. This is above the `ScanNetwork` threshold of 0.25, so the action is considered for detection.
+
+| Action Type | Ratio Threshold |
+|---|---|
+| ScanNetwork | 0.25 |
+| FindServices | 0.3 |
+| ExploitService | 0.25 |
+| FindData | 0.5 |
+| ExfiltrateData | 0.25 |
+| BlockIP | 1 |
+
+### Consecutive Actions Threshold
+
+Defines the number of consecutive occurrences of an action type within the time window that must be exceeded before the action is considered for detection.
+
+| Action Type | Consecutive Threshold |
+|---|---|
+| ScanNetwork | 2 |
+| FindServices | 3 |
+| ExfiltrateData | 2 |
+
+### Repeated Action Threshold
+
+Applies to certain action types and defines the number of times a specific action must appear in the **entire episode** before it can be considered for detection.
+
+| Action Type | Repeated Threshold |
+|---|---|
+| ExploitService | 2 |
+| FindData | 2 |
+
+## Decision Logic
+
+The system monitors actions and maintains a history of recent ones within the time window:
+
+1. If an action's **Type Ratio Threshold** is met within the time window **or** it exceeds the **Consecutive Actions Threshold**, it is evaluated for detection.
+2. If the action type has a **Repeated Action Threshold** and has not been repeated enough times in the episode, it is ignored.
+3. If an action meets the conditions above, it is subject to detection based on its predefined **Detection Probability**.
+4. Actions that do not meet any threshold conditions are ignored, ensuring that occasional activity does not lead to unnecessary detections.
+
+This approach ensures that only repeated or excessive behavior is flagged, reducing false positives while maintaining a realistic monitoring system.
+
+## API Reference
+
+::: netsecgame.game.global_defender
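The decision logic documented in `docs/global_defender.md` above can be sketched in a few lines of Python. This is an illustrative reading of the prose and tables, not the implementation in `netsecgame.game.global_defender`. The names `DefenderSketch`, `is_candidate`, and `trailing_run` are hypothetical. Three assumptions are made: the ratio is computed over the actions currently in the window (which may hold fewer than 5 at the start of an episode), a ratio equal to the threshold counts as met, and the repeated-action count is kept per action type rather than per exact action.

```python
import random
from collections import Counter, deque

# Values copied from the tables in the documentation above.
DETECTION_PROBABILITY = {
    "ScanNetwork": 0.05, "FindServices": 0.075, "ExploitService": 0.1,
    "FindData": 0.025, "ExfiltrateData": 0.025, "BlockIP": 0.01,
}
RATIO_THRESHOLD = {
    "ScanNetwork": 0.25, "FindServices": 0.3, "ExploitService": 0.25,
    "FindData": 0.5, "ExfiltrateData": 0.25, "BlockIP": 1.0,
}
CONSECUTIVE_THRESHOLD = {"ScanNetwork": 2, "FindServices": 3, "ExfiltrateData": 2}
REPEATED_THRESHOLD = {"ExploitService": 2, "FindData": 2}
WINDOW_SIZE = 5


def trailing_run(window, action_type):
    """Length of the run of `action_type` at the end of the window."""
    run = 0
    for item in reversed(window):
        if item != action_type:
            break
        run += 1
    return run


def is_candidate(window, episode_counts, action_type):
    """Steps 1, 2 and 4: does the latest action qualify for a detection roll?"""
    ratio = window.count(action_type) / len(window)
    ratio_met = ratio >= RATIO_THRESHOLD[action_type]
    limit = CONSECUTIVE_THRESHOLD.get(action_type)
    run_exceeded = limit is not None and trailing_run(window, action_type) > limit
    if not (ratio_met or run_exceeded):
        return False  # step 4: occasional activity is ignored
    needed = REPEATED_THRESHOLD.get(action_type)
    if needed is not None and episode_counts[action_type] < needed:
        return False  # step 2: not repeated often enough in the episode
    return True


class DefenderSketch:
    """Keeps the sliding window and episode counts, then rolls for detection."""

    def __init__(self, rng=None):
        self.window = deque(maxlen=WINDOW_SIZE)  # oldest action drops out
        self.episode_counts = Counter()
        self.rng = rng or random.Random()

    def observe(self, action_type):
        self.window.append(action_type)
        self.episode_counts[action_type] += 1
        if not is_candidate(list(self.window), self.episode_counts, action_type):
            return False
        # Step 3: apply the per-type detection probability.
        return self.rng.random() < DETECTION_PROBABILITY[action_type]
```

Passing a seeded `random.Random` as `rng` makes the detection rolls reproducible, which is convenient when comparing agents across runs.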
diff --git a/docs/index.md b/docs/index.md
index 0d37d63e..3632f6c5 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -1,187 +1,30 @@
 # NetSecGame
-The NetSecGame (Network Security Game) is a framework for training and evaluation of AI agents in network security tasks (both offensive and defensive). It is built with [CYST](https://pypi.org/project/cyst/) network simulator and enables rapid development and testing of AI agents in highly configurable scenarios. Examples of implemented agents can be seen in the submodule [NetSecGameAgents](https://github.com/stratosphereips/NetSecGameAgents/tree/main).
 
-## Installation Guide
-It is recommended to run the environment in the Docker container. The up-to-date image can be found in [Dockerhub](https://hub.docker.com/r/stratosphereips/netsecgame).
-```bash
-docker pull stratosphereips/netsecgame
-```
-#### Building the image locally
-Optionally, you can build the image locally with:
-```bash 
-docker build -t netsecgame:local .
-```
+The **NetSecGame** (Network Security Game) is a framework for training and evaluation of AI agents in network security tasks. It supports both offensive and defensive operations in a dynamic, multi-agent environment built on top of the [CYST](https://pypi.org/project/cyst/) network simulator.
 
-### Installing from source
-In case you need to modify the environment and run directly, we recommend installing it in a virtual environment (Python venv or Conda):
-#### Python venv
-1. Create new virtual environment
-```bash
-python -m venv <venv-name>
-```
-2. Activate newly created venv
-```bash
-source <venv-name>/bin/activate
-```
+## Key Features
 
-#### Conda
-1.  Create new conda environment
-```bash
-conda create --name aidojo python==3.12
-```
-2. Activate newly created conda env
-```bash
-conda activate aidojo
-```
+- **Multi-agent support** — Multiple attackers, defenders, and benign users can interact simultaneously in real-time.
+- **Configurable scenarios** — Choose from predefined network topologies or define custom ones using CYST configurations.
+- **Standard RL interface** — Agents submit [Actions](game_components.md) and receive [Observations](architecture.md#observations) with state, reward, and terminal flag.
+- **Rich game state** — The [GameState](architecture.md#gamestate) captures networks, hosts, services, data, and firewall blocks — far richer than flat vector representations.
+- **Stochastic global defender** — A built-in [SIEM-like defender](global_defender.md) provides realistic opposition without requiring a trained agent.
+- **Dynamic topologies** — Optionally randomize IP addresses between episodes to prevent overfitting.
+- **Trajectory recording** — Record and analyze full episode trajectories for debugging and research.
 
-### After preparing virtual environment, install using pip:
-```bash
-pip install -e .
-```
+## Quick Links
 
-## Quick Start
-A task configuration YAML file is required for starting the NetSecGame environment.  For the first step, the example task configuration is recommended:
-
-### Example Configuration
-```yaml
-# Example of the task configuration for NetSecGame
-# The objective of the Attacker in this task is to locate specific data
-# and exfiltrate it to a remote C&C server.
-# The scenario starts AFTER the initial breach of the local network
-# (the attacker controls 1 local device + the remote C&C server).
-
-coordinator: 
-  agents: 
-    Attacker: # Configuration of 'Attacker' agents
-      max_steps: 25 # timout set for the role `Attacker`
-      goal: # Definition of the goal state
-        description: "Exfiltrate data from Samba server to remote C&C server."
-        known_networks: []
-        known_hosts: []
-        controlled_hosts: []
-        known_services: {}
-        known_data: {213.47.23.195: [[User1,DataFromServer1]]} # winning condition
-        known_blocks: {}
-      start_position: # Definition of the starting state (keywords "random" and "all" can be used)
-        known_networks: []
-        known_hosts: []
-        controlled_hosts: [213.47.23.195, random] # keyword 'random' will be replaced by randomly selected IP during initilization
-        known_services: {}
-        known_data: {}
-        known_blocks: {}
-
-    Defender:
-      goal:
-        description: "Block all attackers"
-        known_networks: []
-        known_hosts: []
-        controlled_hosts: []
-        known_services: {}
-        known_data: {}
-        known_blocks: {213.47.23.195: 'all_attackers'}
-
-      start_position:
-        known_networks: []
-        known_hosts: []
-        controlled_hosts: []
-        known_services: {}
-        known_data: {}
-        blocked_ips: {}
-        known_blocks: {}
-
-env: # Environment configuration
-  scenario: 'two_networks_tiny' # use the smallest topology for this example
-  use_global_defender: False # Do not use global SIEM Defender
-  use_dynamic_addresses: False # Do not randomize IP addresses
-  use_firewall: True # Use firewall
-  save_trajectories: False # Do not store trajectories
-  required_players: 1 # Minimal number of agents required to start the game
-  rewards: # Configurable reward function
-    success: 100
-    step: -1
-    fail: -10
-    false_positive: -5 
-```
-### Starting the NetSecGame
-With the configuration ready the environment can be started in selected port
-#### In Docker container
-```bash
-docker run -d --rm --name nsg-server\
-  -v $(pwd)/examples/example_task_configuration.yaml:/netsecgame/netsecenv_conf.yaml \
-  -v $(pwd)/logs:/netsecgame/logs \
-  -p 9000:9000 stratosphereips/netsecgame \
-  --debug_level="INFO"
-```
-`--name nsg-server`: specifies the name of the container
-
-`-v <your-configuration-yaml>:/netsecgame/netsecenv_conf.yaml` : Mapping of the configuration file
-
-`-v $(pwd)/logs:/netsecgame/logs`: Mapping of the folder where logs are stored
-
-` -p <selected-port>:9000`: Mapping of the port in which the server runs
-
-`--debug_level` is an optional parameter to control the logging level `--debug_level=["DEBUG", "INFO", "WARNING", "CRITICAL"]` (default=`"INFO"`):
-##### Running on Windows (with Docker desktop)
-When running on Windows, Docker desktop is required.
-```batch
-docker run -d --rm --name netsecgame-server ^
-  -p 9000:9000 ^
-  -v "%cd%\examples\example_task_configuration.yaml:/netsecgame/netsecenv_conf.yaml" ^
-  -v "%cd%\logs:/netsecgame/logs" ^
-  stratosphereips/netsecgame:latest \
-  --debug_level="INFO"
-```
-
-#### Locally
-The environment can be started locally from the root folder of the repository with the following command:
-```bash
-python3 -m netsecgame.game.worlds.NetSecGame \
-  --task_config=./examples/example_task_configuration.yaml \
-  --game_port=9000 \
-  --debug_level="INFO"
-```
-Upon which the game server is created on `localhost:9000` to which the agents can connect to interact in the NetSecGame.
-
-### Components of the NetSecGame Environment
-The NetSecGame has several components in the following files:
-```
-├── netsecgame/
-|	├── agents/
-|		├── base_agent.py # Basic agent class. Implements the API for agent-server communication
-|	├── game/
-|		├── scenarios/
-|		    ├── three_net_scenario.py
-|		    ├── two_nets.py
-|		    ├── two_nets_tiny.py
-|		    ├── two_nets_small.py
-|		    ├── one_net.py
-|		├── worlds/
-|   		├── NetSecGame.py # (NSG) basic simulation 
-|   		├── RealWorldNetSecGame.py # Extension of `NSG` - runs actions in the *network of the host computer*
-|   		├── CYSTCoordinator.py # Extension of `NSG` - runs simulation in CYST engine.
-|   		├── WhiteBoxNetSecGame.py # Extension of `NSG` - provides agents with full list of actions upon registration.
-|		├── config_parser.py # NSG task configuration parser
-|		├── configuration_manager.py # Manages the loading and access of game configuration.
-|		├── coordinator.py # Core game server. Not to be run as stand-alone world (see worlds/)
-|		├── agent_server.py # Class used for serving the agents when connecting to the game run by the GameCoordinator.
-|	    ├── global_defender.py # Stochastic (non-agentic defender)
-|	├── game_components.py # contains basic building blocks of the environment
-|	├── utils/
-|		├── utils.py
-|		├── trajectory_recorder.py
-|		├── trajectory_analysis.py
-|		├── aidojo_log_colorizer.py
-|		├── gameplay_graphs.py
-|		├── actions_parser.py
-|		├── log_parser.py
-```
-Some compoments are described in detail in following sections:
-
-- [Architecture](architecture.md) describes the architecture and important design decisions of the NetSecGame
-- [Configuration](configuration.md) describes the task and scenario configuration for NetSecGame
-- [API Reference](game_components.md) provides details of the API
+| | |
+|---|---|
+| **[Getting Started](getting_started.md)** | Installation, configuration, and running your first game |
+| **[Architecture](architecture.md)** | Game components, actions, preconditions, and observations |
+| **[Configuration](configuration.md)** | Detailed environment and task configuration reference |
+| **[Global Defender](global_defender.md)** | Stochastic detection system and thresholds |
+| **[API Reference](game_components.md)** | Auto-generated API documentation |
+| **[NetSecGameAgents](https://github.com/stratosphereips/NetSecGameAgents)** | Reference agent implementations (Random, Tabular, LLM, DQN) |
+| **[GitHub](https://github.com/stratosphereips/NetSecGame)** | Project source code, issue tracker, and contributions |
+| **[PyPI](https://pypi.org/project/netsecgame/)** | Latest NetSecGame releases on the Python Package Index |
 
 ## About
-This code was developed at the [Stratosphere Laboratory at the Czech Technical University in Prague](https://www.stratosphereips.org/). The project is supported by Strategic Support for the Development of Security Research in the Czech Republic 2019–2025 (IMPAKT 1) program, by the Ministry of the Interior of the Czech Republic under No.
-VJ02010020 – AI-Dojo: Multi-agent testbed for the
-research and testing of AI-driven cyber security technologies.
\ No newline at end of file
+
+This project was developed at the [Stratosphere Laboratory at the Czech Technical University in Prague](https://www.stratosphereips.org/). The project is supported by Strategic Support for the Development of Security Research in the Czech Republic 2019–2025 (IMPAKT 1) program, by the Ministry of the Interior of the Czech Republic under No. VJ02010020 – AI-Dojo: Multi-agent testbed for the research and testing of AI-driven cyber security technologies.
\ No newline at end of file
diff --git a/docs/trajectory_recorder.md b/docs/trajectory_recorder.md
new file mode 100644
index 00000000..e9885411
--- /dev/null
+++ b/docs/trajectory_recorder.md
@@ -0,0 +1,4 @@
+# Trajectory Recorder
+The `trajectory_recorder` module provides functionality to save and manage game trajectories, which capture the full sequence of states, actions, and rewards during an episode.
+
+::: netsecgame.utils.trajectory_recorder
diff --git a/docs/utils.md b/docs/utils.md
new file mode 100644
index 00000000..ac0c788e
--- /dev/null
+++ b/docs/utils.md
@@ -0,0 +1,4 @@
+# General Utilities
+This module provides various utility functions used throughout the NetSecGame framework.
+
+::: netsecgame.utils.utils
diff --git a/mkdocs.yml b/mkdocs.yml
index 28d4596e..bfbcf0c7 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -5,17 +5,23 @@ theme:
 
 nav:
   - Home: index.md
+  - Getting Started: getting_started.md
   - Architecture: architecture.md
   - Configuration: configuration.md
-  - Documentation: 
-    - game_components.md
-    - agent_server.md
-    - configuration_manager.md
-    - config_parser.md
-    - game_coordinator.md
-    - worlds:
-      - NetSecGame.md
-      - WhiteBoxNetSecGame.md
+  - Global Defender: global_defender.md
+  - API Reference:
+    - Game Components: game_components.md
+    - Base Agent: base_agent.md
+    - Agent Server: agent_server.md
+    - Game Coordinator: game_coordinator.md
+    - Configuration Manager: configuration_manager.md
+    - Config Parser: config_parser.md
+    - Worlds:
+      - NetSecGame: NetSecGame.md
+      - WhiteBoxNetSecGame: WhiteBoxNetSecGame.md
+    - Utils:
+      - Utils: utils.md
+      - Trajectory Recorder: trajectory_recorder.md
       
 
 plugins:
diff --git a/netsecgame/agents/base_agent.py b/netsecgame/agents/base_agent.py
index d8509019..28c3f25f 100644
--- a/netsecgame/agents/base_agent.py
+++ b/netsecgame/agents/base_agent.py
@@ -10,7 +10,6 @@
 
 class BaseAgent(ABC):
     """
-    Author: Ondrej Lukas, ondrej.lukas@aic.cvut.cz
     Basic agent for the network based NetSecGame environment. Implemenets communication with the game server.
     """
 
@@ -77,7 +76,7 @@ def make_step(self, action: Action) -> Optional[Observation]:
             None: If no observation is received from the server.
 
         Raises:
-            Any exceptions raised by the `communicate` method are propagated.
+            Exception: Any exceptions raised by the `communicate` method are propagated.
         """
         _, observation_dict, _ = self.communicate(action)
         if observation_dict:
diff --git a/netsecgame/utils/__init__.py b/netsecgame/utils/__init__.py
new file mode 100644
index 00000000..8bf9f92b
--- /dev/null
+++ b/netsecgame/utils/__init__.py
@@ -0,0 +1 @@
+# Initialize utils module
diff --git a/netsecgame/utils/trajectory_recorder.py b/netsecgame/utils/trajectory_recorder.py
index a2616934..14935237 100644
--- a/netsecgame/utils/trajectory_recorder.py
+++ b/netsecgame/utils/trajectory_recorder.py
@@ -50,8 +50,8 @@ def add_step(self, action: Action, reward: float, next_state: GameState, end_rea
             end_reason (Optional[str]): Reason for episode end, if applicable.
         """
         self.logger.debug(f"Adding step to trajectory for {self.agent_name}")
-        # Assuming Action and GameState have .as_dict property or method as in original code
-        # In original code: action.as_dict, next_state.as_dict
+        if len(self._data["trajectory"]["states"]) == 0:
+            self.logger.warning(f"The current action (id:{action.id}) is being added as the first step, but the initial state has not been recorded yet. Please call add_initial_state() at the beginning of the episode.")
         self._data["trajectory"]["actions"].append(action.as_dict)
         self._data["trajectory"]["rewards"].append(reward)
         self._data["trajectory"]["states"].append(next_state.as_dict)
@@ -77,16 +77,18 @@ def get_trajectory(self) -> Dict[str, Any]:
         """
         return self._data
 
-    def save_to_file(self, location: str = "./logs/trajectories") -> None:
+    def save_to_file(self, location: str = "./logs/trajectories", filename: Optional[str] = None) -> None:
         """
         Saves the recorded trajectory to a JSONL file.
 
         Args:
             location (str): Directory to save the file. Defaults to "./logs/trajectories".
+            filename (Optional[str]): Name of the file to save, without the `.jsonl` extension. If None, defaults to "{datetime.now():%Y-%m-%d}_{self.agent_name}_{self.agent_role}".
         """
-        filename = f"{datetime.now():%Y-%m-%d}_{self.agent_name}_{self.agent_role}"
+        if filename is None:
+            filename = f"{datetime.now():%Y-%m-%d}_{self.agent_name}_{self.agent_role}"
         try:
             store_trajectories_to_jsonl(self._data, location, filename)
-            self.logger.info(f"Trajectory stored in {os.path.join(location, filename)}.jsonl")
+            self.logger.debug(f"Trajectory stored in {os.path.join(location, filename)}.jsonl")
         except Exception as e:
             self.logger.error(f"Failed to store trajectory: {e}")