This project focuses on fine-tuning and using Gemma 3 models for reasoning tasks.
├── src/
│   ├── inference/   # Inference related code
│   ├── training/    # Training related code
│   ├── notebooks/   # Jupyter notebooks
│   ├── datasets/    # Dataset handling code
│   ├── logging/     # Logging utilities
│   └── *.py         # Core modules
├── tests/           # Test directory
└── docs/            # Documentation
- Python 3.10 or later
- Poetry for dependency management
- Run the setup script:
./project_setup.sh
This will:
- Create and activate the virtual environment at ~/venvs/gemma3 (if it doesn't exist)
- Install Poetry and project dependencies
- Configure the Jupyter kernel
If you prefer to set up manually:
- Create and activate the virtual environment:
python -m venv ~/venvs/gemma3
source ~/venvs/gemma3/bin/activate
- Install Poetry:
pip install --upgrade pip
pip install poetry
- Configure Poetry to use the existing venv:
poetry config virtualenvs.create false
- Install project dependencies:
poetry lock
poetry install
- Install the Jupyter kernel:
python -m ipykernel install --user --name=gemma3 --display-name "Gemma3"
Run training with:
python -m src.training.optimized_gemma_training
Run TensorBoard with:
tensorboard --logdir=logs/gemma_tensorboard
This project is licensed under the MIT License - see the LICENSE file for details.
This module provides a centralized logging system for Gemma training scripts. It offers a unified interface for logging different types of information, with configurable handlers and formatters.
- Unified API: Consistent interface for all logging needs
- Configurable: YAML-based configuration system
- Asynchronous Logging: Background thread processing for high-frequency events
- Multiple Output Formats: Console, File, CSV, and TensorBoard
- Model Metrics: Specialized logging for GRPO/RL training metrics
- Validation: Periodic evaluation on held-out data
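The asynchronous logging feature described above follows the standard-library queue pattern: producers enqueue records while a background thread performs the slow I/O. The sketch below illustrates that pattern with stdlib pieces only; it is not the module's actual implementation.

```python
import logging
import logging.handlers
import queue

# Producers enqueue records instead of doing I/O, so high-frequency
# training-loop events never block on file or console writes.
log_queue = queue.Queue(-1)

logger = logging.getLogger("logging.training")
logger.setLevel(logging.INFO)
logger.addHandler(logging.handlers.QueueHandler(log_queue))

# A background thread drains the queue into the real (slow) handlers.
listener = logging.handlers.QueueListener(log_queue, logging.StreamHandler())
listener.start()

logger.info("Training started")  # enqueues and returns immediately
listener.stop()                  # flushes pending records on shutdown
```

The same pattern extends to any number of downstream handlers: pass them all to the `QueueListener` and the hot path still only pays the cost of an enqueue.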
from src.logging import initialize, get_training_logger
# Initialize with default configuration
initialize()
# Get a logger
logger = get_training_logger()
logger.info("Training started")
from src.logging import log_model_output
log_model_output(
    question="What is 2+2?",
    true_answer="4",
    model_output="<reasoning>2+2=4</reasoning>\n<answer>4</answer>",
    reasoning="2+2=4"
)
from src.logging import log_reward
log_reward(
    reward_name="correctness_reward",
    values=[1.0, 0.0, 1.0],
    samples=["sample1", "sample2", "sample3"]
)
from src.logging import log_training_progress
log_training_progress(
    step=current_step,
    metrics={
        "loss": current_loss,
        "accuracy": current_accuracy,
        "learning_rate": current_lr
    }
)
from src.logging import log_reward_metrics, log_generation_metrics
# Log reward metrics with trend analysis
log_reward_metrics(
    step=current_step,
    rewards_dict={
        "correctness_reward": batch_correctness_rewards,
        "anti_repetition_reward": batch_anti_repetition_rewards,
        "topic_relevance_reward": batch_topic_relevance_rewards
    }
)
# Log generation diversity metrics
log_generation_metrics(
    step=current_step,
    generations=batch_generations
)
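Generation diversity is commonly measured with distinct-n, the ratio of unique n-grams to total n-grams across a batch. The sketch below shows one way to compute it; the exact metrics `log_generation_metrics` computes internally are an assumption here.

```python
def distinct_n(generations, n=2):
    """Ratio of unique n-grams to total n-grams across a batch of texts.

    Values near 1.0 indicate diverse outputs; values near 0.0 indicate
    repetitive, collapsed generations.
    """
    total, unique = 0, set()
    for text in generations:
        tokens = text.split()
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0

# 4 unique bigrams out of 6 total -> 2/3
score = distinct_n(["the cat sat", "the dog ran", "the cat sat"], n=2)
```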
from src.logging import run_validation
run_validation(
    step=current_step,
    model=model,
    tokenizer=tokenizer
)
The logging system can be configured using a YAML file. Here's a basic example:
version: 1
disable_existing_loggers: false

root:
  level: INFO
  handlers: [console]

loggers:
  logging.training:
    level: INFO
    handlers: [console, file]

handlers:
  console:
    class: logging.StreamHandler
    formatter: standard
    level: INFO
  file:
    class: logging.handlers.RotatingFileHandler
    formatter: standard
    filename: logs/training.log
    maxBytes: 10485760  # 10 MB
    backupCount: 5

formatters:
  standard:
    format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
    datefmt: "%Y-%m-%d %H:%M:%S"
To use a custom configuration:
from src.logging import initialize
initialize("path/to/custom_config.yaml")
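Under the hood, a YAML file like the one above is typically applied through the standard library's `logging.config.dictConfig`. This is a hedged sketch of that pattern; the helper name `load_logging_config` and the directory-creation step are assumptions, not the module's actual code.

```python
import logging.config
import pathlib

import yaml  # PyYAML, listed under Requirements


def load_logging_config(path):
    """Parse a YAML logging config and apply it via the stdlib dictConfig."""
    config = yaml.safe_load(pathlib.Path(path).read_text())
    # File handlers fail to open if the log directory is missing,
    # so create any parent directories up front.
    for handler in config.get("handlers", {}).values():
        filename = handler.get("filename")
        if filename:
            pathlib.Path(filename).parent.mkdir(parents=True, exist_ok=True)
    logging.config.dictConfig(config)
```

Because `dictConfig` is the stdlib schema, any key valid there (filters, extra formatters, custom handler classes) works in the YAML file unchanged.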
See example.py for a complete demonstration of the logging system in action.
The system includes several custom handlers:
- CSVHandler: Writes model outputs to CSV files asynchronously
- TensorBoardHandler: Logs metrics to TensorBoard for visualization
These are automatically configured when specified in the YAML configuration file.
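A handler like CSVHandler can be built by subclassing `logging.Handler`. The sketch below is a simplified, synchronous illustration; the module's actual handler additionally buffers writes on a background thread.

```python
import csv
import logging


class SimpleCSVHandler(logging.Handler):
    """Append each log record as a CSV row: (timestamp, level, message)."""

    def __init__(self, filename):
        super().__init__()
        self._file = open(filename, "a", newline="", encoding="utf-8")
        self._writer = csv.writer(self._file)

    def emit(self, record):
        try:
            self._writer.writerow(
                [record.created, record.levelname, record.getMessage()]
            )
            self._file.flush()
        except Exception:
            # Standard logging convention: never raise from emit().
            self.handleError(record)

    def close(self):
        self._file.close()
        super().close()
```

Attach it like any other handler (`logger.addHandler(SimpleCSVHandler("outputs.csv"))`), or reference its dotted class path from the YAML configuration.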
The system provides specialized support for:
- Reward Tracking: Statistics for reward functions over time
- Generation Metrics: Diversity, format adherence, etc.
- Validation: Periodic evaluation on held-out data
These metrics are designed specifically for GRPO/RL-style training where traditional loss metrics may not be informative.
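Reward tracking usually reduces each batch of per-sample rewards to summary statistics before plotting, since those trends are what reveal learning when the loss itself is uninformative. A sketch of that reduction follows; the helper name and the particular statistics chosen are assumptions, not the module's actual code.

```python
import numpy as np


def summarize_rewards(rewards_dict):
    """Reduce per-sample reward lists to batch statistics worth plotting.

    In GRPO/RL-style training, per-reward mean/std/min/max over each
    batch is typically more informative than the raw training loss.
    """
    summary = {}
    for name, values in rewards_dict.items():
        arr = np.asarray(values, dtype=np.float64)
        summary[name] = {
            "mean": float(arr.mean()),
            "std": float(arr.std()),
            "min": float(arr.min()),
            "max": float(arr.max()),
        }
    return summary


# e.g. stats["correctness_reward"]["min"] == 0.0, ["max"] == 1.0
stats = summarize_rewards({"correctness_reward": [1.0, 0.0, 1.0]})
```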
- Python 3.6+
- PyYAML
- TensorBoard (optional, for visualization)
- NumPy (for metrics calculation)