JSONFileObserver · A ShellObserver Auditor

Real‑time shell activity recording → structured JSON audit trail for compliance, analysis & debugging

📌 Overview

JSONFileObserver is a concrete implementation of ShellObserver that records all shell interactions — commands, context mutations, and internal errors — into a persistent JSON file. Designed for technical auditing, post‑incident forensics, and data analytics pipelines.

Every event is timestamped (Unix epoch) and stored with rich metadata. The observer automatically initialises the log file and safely handles JSON read/write operations, making it ideal for long‑running shell automation frameworks.

✨ Key features

✔ Non‑intrusive event recording  |  ✔ Human‑readable JSON (indented)  |  ✔ Simple file‑based persistence  |  ✔ Error resilience  |  ✔ Context change tracking

🧩 JSONFileObserver extends ShellObserver

Purpose: Ideal for technical auditing, compliance logging (SOC2, PCI), and generating structured datasets for ML anomaly detection or shell behaviour analysis.

The observer stores a list of event entries in a JSON file. Each event is appended after reading the existing log, preventing data corruption and ensuring audit integrity.

| Attribute  | Type   | Description                                                                  |
|------------|--------|------------------------------------------------------------------------------|
| `log_path` | `Path` | Filesystem path to the audit JSON file. Defaults to `"shell_audit.json"`.     |

⚙️ Constructor

__init__(log_path: str = "shell_audit.json")

Initializes the observer with a target log file path. If the file does not exist, it creates an empty JSON array ([]) to guarantee a valid starting state.

def __init__(self, log_path: str = "shell_audit.json") -> None

Behaviour: On creation, calls _write_logs([]) when the log file is missing. This ensures the first read never fails due to missing file.
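A minimal sketch of this behaviour, assuming the `_write_logs` helper described in the next section (the class body here is illustrative, not the full implementation):

```python
import json
from pathlib import Path
from typing import Any, Dict, List


class JSONFileObserver:
    """Constructor sketch: guarantees a valid JSON array on first use."""

    def __init__(self, log_path: str = "shell_audit.json") -> None:
        self.log_path = Path(log_path)
        # If the file is missing, seed it with an empty array so the
        # first _read_logs call never fails.
        if not self.log_path.exists():
            self._write_logs([])

    def _write_logs(self, logs: List[Dict[str, Any]]) -> None:
        self.log_path.write_text(json.dumps(logs, indent=4), encoding="utf-8")
```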

🔧 Core Methods (Internal & Public)

The class provides both private helper methods for JSON persistence and public observer callbacks required by the ShellObserver interface.

_read_logs() → List[Dict[str, Any]]

Reads the entire JSON log file and returns a list of event entries. Handles JSONDecodeError or FileNotFoundError gracefully by returning an empty list, preventing crashes from corrupted or missing files.

_write_logs(logs: List[Dict[str, Any]])

Writes the complete log list to the JSON file with indentation (indent=4) and UTF‑8 encoding. Opens the file in write mode, replacing its previous contents in full; note this rewrite is not an atomic file replace (see the durability notes below).

_append_entry(entry: Dict[str, Any])

Helper that reads current logs, appends a new entry, and writes back the updated array. Ensures all events are persisted sequentially.

📡 Event Handlers (ShellObserver Interface)

These methods are automatically triggered by the shell environment when commands are executed, context variables change, or internal errors occur.

on_command_result(result: CommandResult)

Records a shell command execution. Captures command string, success status, exit code, execution duration, stdout length, and stderr (if any).

→ event: "command_executed"

on_context_change(key: str, value: Any)

Triggered when shell context (environment variable or state) mutates. Stores the changed key and its string representation.

→ event: "context_mutation"

on_error(message: str, error: Optional[Exception] = None)

Logs internal errors from the observer or shell framework. Includes a human‑readable message and optional exception details.

→ event: "internal_error"

Event summary: Each event includes a Unix timestamp and a unique event type to differentiate between command, mutation, or error logs.

`command_executed` · `context_mutation` · `internal_error`

📊 JSON Log Structure & Schema

Each audit file contains a JSON array of event objects. Below are the detailed fields per event type.

📟 Command Executed Event

{
  "timestamp": 1712245678.9123,
  "event": "command_executed",
  "command": "git status",
  "success": true,
  "exit_code": 0,
  "duration": 0.1245,
  "stdout_len": 842,
  "stderr": null
}

🔄 Context Mutation Event

{
  "timestamp": 1712245680.451,
  "event": "context_mutation",
  "change": {
    "current_directory": "/home/user/project"
  }
}

⚠️ Internal Error Event

{
  "timestamp": 1712245699.123,
  "event": "internal_error",
  "message": "Failed to write log entry",
  "exception": "PermissionError(13, 'Access denied')"
}

Note: The stderr field captures trimmed error output; duration is rounded to 4 decimals; exception is only present if an Exception object is provided.
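Because the log is a plain JSON array, downstream tooling can load and filter it directly. A small sketch using only the schema fields documented above (the function names are hypothetical):

```python
import json
from collections import Counter
from pathlib import Path
from typing import List


def summarize_audit(path: str) -> Counter:
    """Count audit events by their "event" type."""
    events = json.loads(Path(path).read_text(encoding="utf-8"))
    return Counter(e["event"] for e in events)


def failed_commands(path: str) -> List[str]:
    """Return the command strings of all failed executions."""
    events = json.loads(Path(path).read_text(encoding="utf-8"))
    return [e["command"] for e in events
            if e["event"] == "command_executed" and not e["success"]]
```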

💡 Usage Example (Python)

Integrate JSONFileObserver into your shell automation or custom shell environment. All events are automatically persisted.

from shell_observer import ShellObserver, CommandResult   # hypothetical framework
from json_file_observer import JSONFileObserver
from pathlib import Path

# Initialize observer with a custom log path
# (make sure the target directory exists first)
Path("logs").mkdir(parents=True, exist_ok=True)
auditor = JSONFileObserver(log_path="logs/audit_trail.json")

# Simulated shell event: command result
cmd_result = CommandResult(
    command_sent="ls -la",
    return_code=0,
    execution_time=0.0234,
    standard_output="total 48\ndrwxr-xr-x ...",
    standard_error=""
)
auditor.on_command_result(cmd_result)

# Context change (e.g., environment variable)
auditor.on_context_change("USER_MODE", "debug")

# Log an internal error
auditor.on_error("JSON serialization warning", None)

# The log file "logs/audit_trail.json" now contains 3 structured events.
print(f"Audit saved at: {auditor.log_path}")

✨ The observer works seamlessly with any shell that implements the ShellObserver protocol — whether it's a custom REPL, remote command executor, or interactive testing harness.

⚡ Durability & Concurrency Notes

Since each event triggers a full read+write cycle, JSONFileObserver is best suited for low‑to‑medium frequency event streams (typical shell sessions). For high‑throughput or multi‑process environments, wrap the read‑modify‑write cycle in a file lock. The current implementation uses standard file I/O; rewriting the whole file is not atomic, so a crash mid‑write can leave a truncated log. If stronger durability is required, write to a temporary file and rename it into place. The indented JSON formatting keeps the file manually inspectable and compatible with data pipelines.
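One way to guard the read‑modify‑write cycle against concurrent writers is advisory locking. A sketch using the standard‑library `fcntl` module (POSIX‑only; the function name `locked_append` is hypothetical, not part of the class):

```python
import fcntl
import json
from pathlib import Path
from typing import Any, Dict


def locked_append(log_path: str, entry: Dict[str, Any]) -> None:
    """Append one entry under an exclusive advisory lock (POSIX only)."""
    path = Path(log_path)
    if not path.exists():
        path.write_text("[]", encoding="utf-8")
    with open(path, "r+", encoding="utf-8") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)      # blocks until the lock is free
        try:
            logs = json.load(fh)
            logs.append(entry)
            fh.seek(0)
            json.dump(logs, fh, indent=4)
            fh.truncate()                   # drop any leftover trailing bytes
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)
```

Note that `flock` locks are advisory: they only protect against writers that also take the lock, so every process touching the file must use the same wrapper.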

📁 File Initialization Strategy

If the log file doesn’t exist on __init__, the observer writes an empty JSON array []. This prevents race conditions on first read and provides a valid schema from the start.