
This is the moment where theory meets reality. In the last post, I introduced PathWild and the framework I’m following from Emmanuel Ameisen’s “Building Machine Learning Powered Applications.” Now it’s time to get our hands dirty with the first major step in Part 1: building heuristics based on domain knowledge.
Here’s the thing most AI/ML tutorials skip: before you train a single model, you need to understand your problem domain deeply enough to encode what you already know. Not what you think might work. What wildlife biologists and experienced hunters have observed for decades.
The Goals: Activity Level AND Population Size
Initially, I was thinking too narrowly—just predicting activity level. But talking through the problem, I realized users actually need two different predictions:
Activity Prediction: How active will elk be? (0-100 score)
- This tells you: “Should I hunt today or wait for better conditions?”
- Based on: weather, time of day, moon phase, barometric pressure
Population Prediction: How many elk are likely in this area? (relative population size)
- This tells you: “Is this location worth hunting at all?”
- Based on: elevation, season, vegetation, water sources, hunting pressure
These are fundamentally different questions requiring different heuristics. Let me tackle both.
Part 1: Predicting Elk Activity
What We Know About Elk Behavior
Before writing code, I spent time researching elk behavior patterns. Here’s what wildlife biologists and experienced hunters consistently observe:
Temperature and Elevation:
- Elk move to higher elevations as temperatures rise
- In late summer/early fall, they’re most active when temperatures are 40-60°F
- They become less active in extreme heat (>75°F) or cold (<25°F)
Time of Day:
- Peak activity during dawn (5-8am) and dusk (5-8pm)
- Minimal activity during midday, especially in warm weather
- More willing to move in daytime during overcast conditions
Barometric Pressure:
- Increased activity 12-24 hours before a storm front (falling pressure)
- Reduced activity during rapid pressure drops (they hunker down)
- Normal activity during stable, high pressure
Wind:
- Light to moderate wind (5-15 mph) is ideal
- Strong wind (>20 mph) reduces movement significantly
- Wind direction matters for hunting strategy but less for overall activity
Moon Phase:
- Full moon correlates with increased nighttime feeding
- This means reduced dawn/dusk activity during full moons
- Less impact during new moon
These aren’t guesses—they’re documented patterns from wildlife research and decades of observation.
Building a Simple Scoring Algorithm
Here’s where it gets interesting. I’m not just building one scoring algorithm—I’m building two different approaches and comparing them.
The problem: Should factors multiply together or add together?
Consider this scenario:
- Perfect temperature: 50°F (30 points)
- Perfect time: 6am dawn (25 points)
- Terrible wind: 30mph (3 points)
Additive approach: 30 + 25 + 3 = 58 (still “moderate” activity)
Multiplicative approach: strong wind drags down the other factors → much lower score
Which is correct? I don’t know yet. So I’m testing both.
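To make the difference concrete, here is a minimal sketch of the three-factor scenario above. The multiplicative version is a pure product of each factor's fraction of its maximum, which is an assumption made only for illustration; it is not the count-based multiplier used in the implementation later in this post. The names `scenario`, `max_points`, `additive_score`, and `multiplicative_score` are hypothetical.

```python
# Weighted points from the three-factor scenario (temperature, time of day, wind).
scenario = {"temperature": 30, "time_of_day": 25, "wind": 3}
# Best possible points for the same three factors.
max_points = {"temperature": 30, "time_of_day": 25, "wind": 15}


def additive_score(points):
    """Sum the points; a weak factor only costs its own share."""
    return sum(points.values())


def multiplicative_score(points, max_points):
    """Scale the best possible total by each factor's fraction of its maximum.

    Illustrative pure product: one weak factor drags the whole score down.
    """
    best_total = sum(max_points.values())
    fraction = 1.0
    for name, value in points.items():
        fraction *= value / max_points[name]
    return best_total * fraction


print(additive_score(scenario))                              # 58
print(round(multiplicative_score(scenario, max_points), 1))  # 14.0
```

Under this assumption the same conditions land at 58 with addition and 14 with a pure product, which is the kind of gap the comparison is meant to expose.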
The Scoring Algorithm Implementation
The core idea is simple: each factor gets evaluated and classified into one of three categories based on how favorable it is for elk activity:
- Optimal: Ideal conditions (e.g., 50°F temperature, dawn timing)
- Acceptable: Decent but not perfect (e.g., 65°F temperature, mid-morning)
- Poor: Unfavorable conditions (e.g., 80°F temperature, strong wind)
Each factor returns both a numeric score and a quality classification. This classification helps us understand not just “what’s the total score?” but “how many factors are working against us?”
Here’s the full implementation:
class ElkActivityPredictor:
    def __init__(self):
        # Define optimal ranges for each factor
        self.ranges = {
            'temperature': {
                'optimal': (40, 60),
                'acceptable': (30, 70),
                'poor': (0, 100)  # catch-all
            },
            'time_of_day': {
                'optimal': [(5, 8), (17, 20)],  # dawn and dusk
                'acceptable': [(4, 9), (16, 21)],
                'poor': [(0, 24)]
            },
            'wind_speed': {
                'optimal': (5, 15),
                'acceptable': (0, 20),
                'poor': (0, 100)
            },
            'pressure_trend': {
                'optimal': ['falling'],
                'acceptable': ['stable', 'rising'],
                'poor': ['rapid_fall']
            },
            'moon_illumination': {
                'optimal': (0, 30),
                'acceptable': (0, 70),
                'poor': (0, 100)
            }
        }
        # Point values for each quality level
        self.quality_points = {
            'optimal': 20,
            'acceptable': 10,
            'poor': 2
        }
        # Weights for additive scoring (they sum to 100)
        self.factor_weights = {
            'temperature': 30,
            'time_of_day': 25,
            'pressure': 20,
            'wind': 15,
            'moon': 10
        }

    def score_temperature(self, temp_f, elevation_ft):
        """
        Score temperature based on elk comfort range.
        Adjusts for elevation - higher elevations tolerate warmer temps.
        """
        # Elevation adjustment: +2°F per 1000ft above 5000ft
        elevation_adjustment = max(0, (elevation_ft - 5000) / 1000 * 2)
        adjusted_optimal = (40 + elevation_adjustment, 60 + elevation_adjustment)
        # Determine quality classification
        if adjusted_optimal[0] <= temp_f <= adjusted_optimal[1]:
            quality = 'optimal'
            score = self.factor_weights['temperature']
        elif 30 <= temp_f <= 70:
            quality = 'acceptable'
            score = self.factor_weights['temperature'] * 0.6
        else:
            quality = 'poor'
            score = self.factor_weights['temperature'] * 0.2
        return {
            'score': score,
            'quality': quality,
            'explanation': f"Temperature {temp_f}°F at {elevation_ft}ft elevation"
        }

    def score_time_of_day(self, hour, cloud_cover_percent):
        """
        Score based on crepuscular (dawn/dusk) activity patterns.
        Cloud cover extends acceptable hours.
        """
        # Dawn: 5-8am, Dusk: 5-8pm
        if (5 <= hour <= 8) or (17 <= hour <= 20):
            quality = 'optimal'
            score = self.factor_weights['time_of_day']
        elif (4 <= hour <= 9) or (16 <= hour <= 21):
            quality = 'acceptable'
            score = self.factor_weights['time_of_day'] * 0.6
        elif 9 <= hour <= 16:
            # Midday - but cloud cover helps
            # (hours 9 and 16 are already handled by the branch above)
            quality = 'acceptable' if cloud_cover_percent > 60 else 'poor'
            score = self.factor_weights['time_of_day'] * (0.6 if cloud_cover_percent > 60 else 0.3)
        else:
            quality = 'poor'
            score = self.factor_weights['time_of_day'] * 0.3
        return {
            'score': score,
            'quality': quality,
            'explanation': f"Time {hour}:00 with {cloud_cover_percent}% cloud cover"
        }

    def score_pressure(self, pressure_mb, pressure_trend):
        """
        Score barometric pressure and trend.
        Falling = pre-storm activity, rapid_fall = hunkering down
        """
        if pressure_trend == 'falling':
            quality = 'optimal'
            score = self.factor_weights['pressure']
        elif pressure_trend == 'stable' and pressure_mb > 1013:
            quality = 'acceptable'
            score = self.factor_weights['pressure'] * 0.7
        elif pressure_trend == 'rapid_fall':
            quality = 'poor'
            score = self.factor_weights['pressure'] * 0.2
        else:
            # Rising pressure, or stable pressure at or below 1013 mb
            quality = 'acceptable'
            score = self.factor_weights['pressure'] * 0.6
        return {
            'score': score,
            'quality': quality,
            'explanation': f"Pressure {pressure_mb}mb, {pressure_trend}"
        }

    def score_wind(self, wind_speed_mph):
        """
        Score wind speed. Light-moderate is ideal.
        """
        if 5 <= wind_speed_mph <= 15:
            quality = 'optimal'
            score = self.factor_weights['wind']
        elif wind_speed_mph <= 20:
            quality = 'acceptable'
            score = self.factor_weights['wind'] * 0.6
        else:
            quality = 'poor'
            score = self.factor_weights['wind'] * 0.2
        return {
            'score': score,
            'quality': quality,
            'explanation': f"Wind speed {wind_speed_mph} mph"
        }

    def score_moon(self, moon_illumination_percent):
        """
        Score moon phase. Full moon = more nighttime feeding = less dawn/dusk activity.
        """
        if moon_illumination_percent < 30:
            quality = 'optimal'
            score = self.factor_weights['moon']
        elif moon_illumination_percent <= 70:
            quality = 'acceptable'
            score = self.factor_weights['moon'] * 0.6
        else:
            quality = 'poor'
            score = self.factor_weights['moon'] * 0.5
        return {
            'score': score,
            'quality': quality,
            'explanation': f"Moon illumination {moon_illumination_percent}%"
        }

    def predict_activity_additive(self, conditions):
        """
        Additive scoring: sum all factor scores.
        Good for understanding individual contributions.
        """
        scores = {
            'temperature': self.score_temperature(
                conditions['temp_f'],
                conditions['elevation_ft']
            ),
            'time_of_day': self.score_time_of_day(
                conditions['hour'],
                conditions['cloud_cover_percent']
            ),
            'pressure': self.score_pressure(
                conditions['pressure_mb'],
                conditions['pressure_trend']
            ),
            'wind': self.score_wind(conditions['wind_speed_mph']),
            'moon': self.score_moon(conditions['moon_illumination_percent'])
        }
        # Sum scores
        total_score = sum(s['score'] for s in scores.values())
        # Count quality levels
        quality_counts = {
            'optimal': sum(1 for s in scores.values() if s['quality'] == 'optimal'),
            'acceptable': sum(1 for s in scores.values() if s['quality'] == 'acceptable'),
            'poor': sum(1 for s in scores.values() if s['quality'] == 'poor')
        }
        # Classify
        if total_score >= 75:
            level = 'high'
            explanation = "Excellent conditions for elk activity"
        elif total_score >= 50:
            level = 'moderate'
            explanation = "Good conditions with some limiting factors"
        else:
            level = 'low'
            explanation = "Conditions not favorable for high activity"
        return {
            'method': 'additive',
            'score': round(total_score, 1),
            'level': level,
            'quality_counts': quality_counts,
            'factor_scores': scores,
            'explanation': explanation
        }

    def predict_activity_multiplicative(self, conditions):
        """
        Multiplicative scoring: poor factors heavily penalize total score.
        Better reflects reality where one bad factor can ruin conditions.
        """
        scores = {
            'temperature': self.score_temperature(
                conditions['temp_f'],
                conditions['elevation_ft']
            ),
            'time_of_day': self.score_time_of_day(
                conditions['hour'],
                conditions['cloud_cover_percent']
            ),
            'pressure': self.score_pressure(
                conditions['pressure_mb'],
                conditions['pressure_trend']
            ),
            'wind': self.score_wind(conditions['wind_speed_mph']),
            'moon': self.score_moon(conditions['moon_illumination_percent'])
        }
        # Calculate multiplier based on quality classifications
        quality_counts = {
            'optimal': sum(1 for s in scores.values() if s['quality'] == 'optimal'),
            'acceptable': sum(1 for s in scores.values() if s['quality'] == 'acceptable'),
            'poor': sum(1 for s in scores.values() if s['quality'] == 'poor')
        }
        # Base score from additive
        base_score = sum(s['score'] for s in scores.values())
        # Apply multipliers
        # Each poor factor reduces by 20%, each optimal adds 10%
        multiplier = 1.0
        multiplier -= (quality_counts['poor'] * 0.20)
        multiplier += (quality_counts['optimal'] * 0.10)
        multiplier = max(0.3, min(1.5, multiplier))  # Clamp to reasonable range
        # Note: a multiplier above 1.0 can push the final score past 100
        final_score = base_score * multiplier
        # Classify
        if final_score >= 75:
            level = 'high'
            explanation = f"Excellent conditions ({quality_counts['optimal']} optimal factors)"
        elif final_score >= 50:
            level = 'moderate'
            explanation = f"Mixed conditions ({quality_counts['optimal']} optimal, {quality_counts['poor']} poor)"
        else:
            level = 'low'
            explanation = f"Poor conditions ({quality_counts['poor']} limiting factors)"
        return {
            'method': 'multiplicative',
            'score': round(final_score, 1),
            'level': level,
            'multiplier': round(multiplier, 2),
            'quality_counts': quality_counts,
            'factor_scores': scores,
            'explanation': explanation
        }
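The elevation adjustment inside score_temperature is easy to misread, so here is a small standalone sketch of just that arithmetic. The helper name `adjusted_optimal_band` is hypothetical; the formula (+2°F per 1,000 ft above 5,000 ft, applied to the 40-60°F band) is copied from the method above.

```python
def adjusted_optimal_band(elevation_ft, base_band=(40, 60)):
    """Shift the optimal temperature band up by 2°F per 1,000 ft above 5,000 ft."""
    shift = max(0, (elevation_ft - 5000) / 1000 * 2)
    return (base_band[0] + shift, base_band[1] + shift)


for elevation in (4000, 5000, 8500, 10000):
    low, high = adjusted_optimal_band(elevation)
    print(f"{elevation} ft: optimal {low:.0f}-{high:.0f}°F")
```

At 8,500 ft the optimal band becomes 47-67°F. Only the optimal band moves; the acceptable band stays fixed at 30-70°F, so at 10,000 ft a 45°F reading is classified as acceptable rather than optimal.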
Why Test Both Approaches?
Additive scoring treats each factor independently. Perfect temperature + perfect timing + terrible wind still gives you a decent score (58/100). This might be accurate—elk might still be somewhat active even with bad wind.
Multiplicative scoring is meant to make limiting factors actually limit. Each poor factor cuts the total by 20% and each optimal factor adds 10%, so how far the score drops depends on how the other factors classify.
Which is right? I need data to find out. That’s why I’m implementing both and comparing predictions against actual observations.
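One thing worth checking before collecting any data is how the count-based multiplier in predict_activity_multiplicative behaves across different mixes of factors. The sketch below isolates that rule in a hypothetical helper, `quality_multiplier`, using the same constants as the class (20% penalty, 10% bonus, clamp to 0.3-1.5).

```python
def quality_multiplier(n_optimal, n_poor):
    """Count-based multiplier: -20% per poor factor, +10% per optimal factor,
    clamped to the range [0.3, 1.5]."""
    multiplier = 1.0 - 0.20 * n_poor + 0.10 * n_optimal
    return max(0.3, min(1.5, multiplier))


# (optimal, poor) counts out of five factors; the remainder are "acceptable".
for n_optimal, n_poor in [(5, 0), (4, 1), (2, 1), (0, 3), (0, 5)]:
    print(n_optimal, n_poor, round(quality_multiplier(n_optimal, n_poor), 2))
```

With these constants, two optimal factors cancel one poor factor exactly (multiplier 1.0), and four optimal factors with one poor factor give 1.2, so in that case the multiplicative score comes out higher than the additive one. If a single bad factor is supposed to dominate, the penalty probably needs to be larger or the bonus removed; that is a tuning question for the validation data.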
Part 2: Predicting Population Size
Activity is only half the equation. You also need to know where elk actually are. Here’s the population prediction heuristic:
class ElkPopulationPredictor:
    def __init__(self):
        # Preferred elevation band (ft) for each season
        self.elevation_ranges = {
            'summer': (8000, 11000),
            'fall': (7000, 9500),
            'winter': (5000, 7500),
            'spring': (6000, 8500)
        }

    def determine_season(self, month):
        """Map month to elk season."""
        if month in [6, 7, 8]:
            return 'summer'
        elif month in [9, 10, 11]:
            return 'fall'
        elif month in [12, 1, 2]:
            return 'winter'
        else:
            return 'spring'

    def score_elevation(self, elevation_ft, month):
        """
        Score elevation based on seasonal migration patterns.
        """
        season = self.determine_season(month)
        optimal_min, optimal_max = self.elevation_ranges[season]
        if optimal_min <= elevation_ft <= optimal_max:
            score = 100
            explanation = f"Optimal elevation for {season}"
        elif optimal_min - 1000 <= elevation_ft <= optimal_max + 1000:
            score = 60
            explanation = f"Acceptable elevation for {season}"
        else:
            distance = min(
                abs(elevation_ft - optimal_min),
                abs(elevation_ft - optimal_max)
            )
            # Decay from the "acceptable" score of 60 once past the 1000 ft margin,
            # so a sub-optimal elevation never outscores an acceptable one.
            # (Starting the decay from 100 would give ~80 just outside the margin.)
            score = max(20, 60 - (distance - 1000) / 50)
            explanation = f"Sub-optimal elevation for {season}"
        return {
            'score': score,
            'season': season,
            'explanation': explanation
        }

    def score_vegetation(self, vegetation_type, density_percent):
        """
        Score based on vegetation type and density.
        Elk prefer mixed forest with meadows.
        """
        vegetation_scores = {
            'mixed_forest': 30,
            'aspen_stands': 28,
            'meadows': 25,
            'dense_forest': 15,
            'sparse_forest': 18,
            'scrubland': 12,
            'bare': 5
        }
        # Unknown vegetation types fall back to 10 points
        base_score = vegetation_scores.get(vegetation_type, 10)
        # Density matters - too dense or too sparse is bad
        if 40 <= density_percent <= 70:
            density_multiplier = 1.0
        elif 20 <= density_percent <= 85:
            density_multiplier = 0.7
        else:
            density_multiplier = 0.4
        final_score = base_score * density_multiplier
        return {
            'score': final_score,
            'explanation': f"{vegetation_type} at {density_percent}% density"
        }

    def score_water_proximity(self, distance_to_water_miles):
        """
        Score based on distance to water source.
        Elk need water daily.
        """
        if distance_to_water_miles <= 0.5:
            score = 25
            explanation = "Very close to water"
        elif distance_to_water_miles <= 1.5:
            score = 20
            explanation = "Reasonable distance to water"
        elif distance_to_water_miles <= 3.0:
            score = 12
            explanation = "Moderate distance to water"
        else:
            score = 5
            explanation = "Too far from water"
        return {
            'score': score,
            'explanation': explanation
        }

    def score_hunting_pressure(self, days_since_season_start, area_access):
        """
        Score based on hunting pressure.
        Elk move to harder-to-access areas as season progresses.
        """
        access_scores = {
            'roadside': 15,
            'trail': 20,
            'backcountry': 25,
            'wilderness': 28
        }
        # Unknown access types fall back to 15 points
        base_score = access_scores.get(area_access, 15)
        # Pressure increases over season
        if days_since_season_start <= 7:
            pressure_multiplier = 1.0
        elif days_since_season_start <= 21:
            # Elk move to harder access areas
            if area_access in ['backcountry', 'wilderness']:
                pressure_multiplier = 1.2
            else:
                pressure_multiplier = 0.6
        else:
            # Late season - deep in wilderness
            if area_access == 'wilderness':
                pressure_multiplier = 1.3
            else:
                pressure_multiplier = 0.4
        final_score = base_score * pressure_multiplier
        return {
            'score': final_score,
            'explanation': f"{area_access} access, {days_since_season_start} days into season"
        }

    def predict_population(self, location_data):
        """
        Predict relative elk population size (0-100).
        """
        scores = {
            'elevation': self.score_elevation(
                location_data['elevation_ft'],
                location_data['month']
            ),
            'vegetation': self.score_vegetation(
                location_data['vegetation_type'],
                location_data['vegetation_density_percent']
            ),
            'water': self.score_water_proximity(
                location_data['distance_to_water_miles']
            ),
            'pressure': self.score_hunting_pressure(
                location_data.get('days_since_season_start', 0),
                location_data['area_access']
            )
        }
        # Sum scores. Nominal max: 100 + 30 + 25 + 28 = 183.
        # Hunting-pressure multipliers above 1.0 can exceed that (28 * 1.3 = 36.4),
        # which is why the normalized score is capped at 100 below.
        total_score = sum(s['score'] for s in scores.values())
        # Normalize to 0-100
        normalized_score = min(100, (total_score / 183) * 100)
        # Classify population density
        if normalized_score >= 70:
            density = 'high'
            explanation = "Excellent habitat - expect high elk density"
        elif normalized_score >= 50:
            density = 'moderate'
            explanation = "Good habitat - moderate elk density"
        elif normalized_score >= 30:
            density = 'low'
            explanation = "Marginal habitat - low elk density"
        else:
            density = 'very_low'
            explanation = "Poor habitat - very low elk density"
        return {
            'score': round(normalized_score, 1),
            'density': density,
            'factor_scores': scores,
            'explanation': explanation
        }
Testing the Complete System
Let’s test both predictors together:
# Initialize predictors
activity_predictor = ElkActivityPredictor()
population_predictor = ElkPopulationPredictor()

# Test conditions
conditions = {
    'temp_f': 52,
    'elevation_ft': 8500,
    'hour': 6,
    'cloud_cover_percent': 40,
    'pressure_mb': 1015,
    'pressure_trend': 'falling',
    'wind_speed_mph': 8,
    'moon_illumination_percent': 25
}
location = {
    'elevation_ft': 8500,
    'month': 10,  # October
    'vegetation_type': 'mixed_forest',
    'vegetation_density_percent': 55,
    'distance_to_water_miles': 0.8,
    'days_since_season_start': 5,
    'area_access': 'trail'
}

# Get predictions
activity_add = activity_predictor.predict_activity_additive(conditions)
activity_mult = activity_predictor.predict_activity_multiplicative(conditions)
population = population_predictor.predict_population(location)

print(f"Activity (Additive): {activity_add['score']} - {activity_add['level']}")
print(f"Activity (Multiplicative): {activity_mult['score']} - {activity_mult['level']}")
print(f"Population: {population['score']} - {population['density']}")
print(f"\nQuality counts: {activity_add['quality_counts']}")
Output:
Activity (Additive): 100 - high
Activity (Multiplicative): 150.0 - high
Population: 92.9 - high
Quality counts: {'optimal': 5, 'acceptable': 0, 'poor': 0}
Recommendation: EXCELLENT hunting conditions - high activity in good habitat
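The Recommendation line is not printed by the test script above. One way it could be derived from the two predictions is sketched below. The function `recommend` and its wording for the non-excellent cases are assumptions made for illustration, not part of either predictor class.

```python
def recommend(activity_level, population_density):
    """Combine the activity level and population density into one line of advice.

    Hypothetical mapping: 'high' or 'moderate' density counts as good habitat.
    """
    good_habitat = population_density in ("high", "moderate")
    if activity_level == "high" and good_habitat:
        return "EXCELLENT hunting conditions - high activity in good habitat"
    if activity_level == "high":
        return "FAIR hunting conditions - high activity but marginal habitat"
    if good_habitat:
        return "FAIR hunting conditions - good habitat but limited activity"
    return "POOR hunting conditions - limited activity in marginal habitat"


print(f"Recommendation: {recommend('high', 'high')}")
```

Keeping the two predictions separate until this last step means each one can be validated on its own before they are merged into a single message.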
What I Learned Building This
1. Separate concerns matter. Activity vs population are different problems. Conflating them would have produced a muddled heuristic.
2. Quality classifications are powerful. Tracking optimal/acceptable/poor gives me insights beyond just a score. I can see “3 optimal factors, 2 poor” which tells a story.
3. Multiplicative vs additive matters. In ideal conditions (all optimal), both methods agree on the level, even though the raw scores differ. But when factors are mixed, they diverge significantly. That divergence will teach me which approach models reality better.
4. Explainability is crucial. Every score comes with an explanation. Users see “Optimal elevation for fall” not just “100 points.” I see “roadside access, 30 days into season” and can trace it to the 0.4 multiplier when debugging.
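As a rough illustration of that explainability, the per-factor dictionaries can be turned into a short text report. The helper `format_report` is hypothetical, and the `sample` dict is hand-written in the same shape the activity predictor returns, with illustrative values.

```python
def format_report(result):
    """Render one prediction dict as a readable, per-factor breakdown."""
    lines = [f"{result['method']} score: {result['score']} ({result['level']})"]
    for name, factor in result["factor_scores"].items():
        lines.append(
            f"  {name:<12} {factor['quality']:<10} "
            f"{factor['score']:>5.1f}  {factor['explanation']}"
        )
    return "\n".join(lines)


# Hand-written sample mirroring the predictor's return shape (illustrative values).
sample = {
    "method": "additive",
    "score": 58.0,
    "level": "moderate",
    "factor_scores": {
        "temperature": {"score": 30, "quality": "optimal",
                        "explanation": "Temperature 50°F at 5000ft elevation"},
        "time_of_day": {"score": 25, "quality": "optimal",
                        "explanation": "Time 6:00 with 20% cloud cover"},
        "wind": {"score": 3.0, "quality": "poor",
                 "explanation": "Wind speed 30 mph"},
    },
}
print(format_report(sample))
```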
5. Domain knowledge beats ML (for now). These heuristics encode years of wildlife research. An ML model trained on limited data would struggle to beat this baseline.
Next Steps
Now I need to:
- Build the inference API – Wrap these predictors in a clean FastAPI interface
- Collect validation data – Record predictions alongside actual observations
- Compare additive vs multiplicative – Which approach correlates better with reality?
- Identify failure modes – When do the heuristics get it completely wrong?
- Start feature engineering – The heuristics tell me which features matter for ML
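For the additive-versus-multiplicative comparison, a first pass could be as simple as correlating each method's score with what was actually observed. The sketch below assumes a validation log of (additive score, multiplicative score, elk sightings) rows; the rows shown are placeholders, not real observations, and `pearson` and `validation_log` are hypothetical names.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Placeholder rows, NOT real data: (additive score, multiplicative score, sightings)
validation_log = [
    (100, 150.0, 9),
    (88, 105.6, 6),
    (61, 61.0, 4),
    (47, 37.6, 1),
    (35, 21.0, 0),
]
additive = [row[0] for row in validation_log]
multiplicative = [row[1] for row in validation_log]
observed = [row[2] for row in validation_log]

print(f"additive vs observed:       r = {pearson(additive, observed):.2f}")
print(f"multiplicative vs observed: r = {pearson(multiplicative, observed):.2f}")
```

With real field data, a rank correlation may be a better fit than Pearson, since sighting counts are unlikely to scale linearly with either score.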
The heuristics give me a working system AND a research agenda. Every prediction that’s wrong teaches me something. Every factor that doesn’t correlate tells me to adjust weights or add new factors.
But here’s the key insight: I now have a complete prototype. It predicts both activity and population. It runs real code. It produces explainable results. And I built it in a few days using domain research, not months of ML training.
That’s the power of starting with heuristics.
This is post 2 in a series documenting my journey building PathWild.ai. Read post 1 for the introduction and framework.
Code repository: [Coming soon – I’ll share the full implementation once I clean it up]
Next post: Building the inference API with FastAPI and testing the prototype
Current focus: Part 1 – Building heuristics and establishing baselines