The Six Honest Signals
From the bare timestamps of a real chat — without reading a word of the content — you compute response times, the balance of contributions and the ratio of giving to taking. An intuition about a group becomes a measurement.
In a nutshell
What: You export a WhatsApp chat (e.g. your class or project group) and let Python read patterns from it — not what is talked about, but how: who writes how much? How quickly are replies sent? Does everyone respond to each other, or do some just talk to themselves?
Why: Exactly these patterns are among the six honest signals that predict whether a group flourishes creatively. You turn a gut feeling about a group into numbers — and lay the groundwork for the symbiont tool in the next chapter.
You need: an exported chat history (.txt) and Python with pandas. Runs in the browser too (Pyodide).
What it's about
Put five people in a room and give them a task — after a few minutes something arises that you can almost touch: a mood, a rhythm, a kind of chemistry. For a long time that was considered unmeasurable. This chapter shows that it is measurable, because a group's chemistry lives in its honest signals — in things no one steers consciously.
Six such signals predict surprisingly well whether a group becomes productive: whether there is a strong, connecting leadership and whether it shifts rather than sticking to one person; how quickly members react to each other; how evenly the contributions are shared; how honestly moods may be expressed; and how much the group develops a language of its own. In this activity you pick out three of them that can already be computed from bare timestamps — response time, contribution balance and giving and taking. The leadership question (betweenness centrality) follows in Activity 8.1.
Patterns only, never content — and only with consent
This analysis reads no content, only timestamps and senders. Even so: a chat history belongs to everyone who writes in it. Get the group's consent, anonymise the names (Person A, B, C …), and don't share the results outside the group. That is the golden rule of this book: what you find out about others belongs to them, not to you. The cleanest choice is a chat you are a member of yourself.
A little background
What we read from timestamps. Every chat line carries three plain facts: when, who, and (the one we ignore here) what. From when and who alone, three measures arise:
- Contribution balance — how evenly the contributions are spread across people. Does everyone talk roughly the same amount, or do a few dominate? A very lopsided chat is a warning sign.
- Response time — the median time between one message and the next from a different person. Short response times mean people listen to each other and react.
- Giving and taking: does someone react to others ("giving") roughly as often as their own messages trigger reactions from others ("taking")? We count both within a short time window (60 minutes in the script below). Someone whose messages create a lot of resonance but who rarely reacts to others stands out here, as does someone who barely takes part in the lively back-and-forth.
These are rough but honest approximations — they turn "the group feels lively" into a number you can compare.
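To make the first two measures concrete, here is a small sketch on a hand-written toy chat. The six messages are invented for illustration, and the sketch uses plain Python instead of pandas so that it runs without a chat file. The Gini value uses the same formula as the script further down.

```python
from collections import Counter
from datetime import datetime
from statistics import median

# Hypothetical toy chat: (timestamp, sender) only, no content.
messages = [
    (datetime(2026, 3, 15, 17, 0), "A"),
    (datetime(2026, 3, 15, 17, 2), "A"),   # A continues: a monologue, not a reply
    (datetime(2026, 3, 15, 17, 10), "B"),  # B follows A after 8 min
    (datetime(2026, 3, 15, 17, 14), "A"),  # A follows B after 4 min
    (datetime(2026, 3, 15, 17, 44), "C"),  # C follows A after 30 min
    (datetime(2026, 3, 15, 17, 46), "A"),  # A follows C after 2 min
]

# Contribution balance: share of messages per person, summarised as a Gini value.
counts = Counter(who for _, who in messages)
shares = sorted(c / len(messages) for c in counts.values())  # ascending
n = len(shares)
gini = 1 - 2 * sum((n - i - 0.5) / n * s for i, s in enumerate(shares))

# Response time: gaps in minutes between consecutive messages from DIFFERENT people.
pairs = list(zip(messages, messages[1:]))
gaps = [(t2 - t1).total_seconds() / 60 for (t1, w1), (t2, w2) in pairs if w1 != w2]
response_time = median(gaps)

# For comparison: the median over ALL gaps, monologue included.
all_gaps = [(t2 - t1).total_seconds() / 60 for (t1, _), (t2, _) in pairs]

print(f"Gini: {gini:.2f}")                                      # → Gini: 0.33
print(f"Median response time: {response_time:.0f} min")         # → 6 min
print(f"Median over all gaps: {median(all_gaps):.0f} min")      # → 4 min
```

A wrote four of the six messages, so the balance is clearly lopsided (Gini about 0.33). Counting A's quick follow-up to herself would pull the median down from 6 to 4 minutes, which is why only changes of sender count as responses.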
Exporting the chat
- Start the export. In WhatsApp open the chat → Menu → "More" → "Export chat" → "Without media". You get a .txt file.
- Consent & anonymise. Ask the group. Before the analysis, replace the real names with A, B, C …, or let the code do it (see below).
- Put the file in place. As chat.txt, in the same folder as the script.
- Analyse. Run the script and read off the three measures.
Analysing with Python
The code splits each line into a time and a sender, throws the content away and computes the three measures. Full, commented code on GitHub.
import re
import pandas as pd

WINDOW = 60  # minutes: what still counts as a "response"

# WhatsApp line: "15.03.26, 17:44 - Mira: Text..."
pattern = re.compile(r"^(\d{1,2}\.\d{1,2}\.\d{2,4}),?\s+(\d{1,2}:\d{2})\s+-\s+([^:]+):")

rows = []
with open("chat.txt", encoding="utf-8") as f:
    for line in f:
        m = pattern.match(line)
        if m:
            date, clock, name = m.groups()
            ts = pd.to_datetime(f"{date} {clock}", dayfirst=True, errors="coerce")
            rows.append((ts, name.strip()))  # the message text is never stored
df = pd.DataFrame(rows, columns=["time", "who"]).dropna().sort_values("time").reset_index(drop=True)

# --- anonymise: real names -> A, B, C ... ---
names = {n: chr(65+i) for i, n in enumerate(df["who"].unique())}
df["who"] = df["who"].map(names)

# 1) contribution balance
share = df["who"].value_counts() / len(df)
p = share.sort_values().values  # ascending shares; inequality as Gini
gini = 1 - 2*sum((len(p)-i-0.5)/len(p) * pi for i, pi in enumerate(p))

# 2) response time: median to the next message from ANOTHER person
df["dt_min"] = df["time"].diff().dt.total_seconds() / 60
df["other"] = df["who"].ne(df["who"].shift())
response_time = df.loc[df["other"], "dt_min"].median()

# 3) giving & taking within the time window
time, who, n = df["time"].tolist(), df["who"].tolist(), len(df)
gives = {p: 0 for p in set(who)}  # I react to another person
takes = {p: 0 for p in set(who)}  # my message triggers reactions
for i in range(n):
    for j in range(i-1, -1, -1):  # giving: do I react?
        if who[j] != who[i]:  # most recent message from someone else
            if (time[i]-time[j]).total_seconds()/60 <= WINDOW:
                gives[who[i]] += 1
            break
    responders = set()  # taking: who reacts to me?
    for k in range(i+1, n):
        if (time[k]-time[i]).total_seconds()/60 > WINDOW or who[k] == who[i]:
            break
        responders.add(who[k])
    takes[who[i]] += len(responders)

print("Contribution share (%):")
print((share*100).round(1).sort_values(ascending=False).to_string())
print(f"\nInequality (Gini, 0=equal .. 1=one person): {gini:.2f}")
print(f"Median response time between people: {response_time:.0f} min\n")
print("Giving & taking (given / received):")
for pn in sorted(set(who), key=lambda x: -takes[x]):
    g, e = gives[pn], takes[pn]
    print(f" {pn} given {g:2d} received {e:2d} G/T {g/e:.2f}" if e
          else f" {pn} given {g:2d} received {e:2d} G/T –")
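The giving-and-taking rule is the hardest of the three to follow, so it helps to trace it by hand once. The sketch below wraps the same counting rule in a function and runs it on five invented messages. The function name `give_take` and the toy data are illustrative assumptions, not part of the script above.

```python
from datetime import datetime, timedelta

def give_take(messages, window_min=60):
    """Count giving and taking per person from (timestamp, sender) pairs sorted by time."""
    people = {who for _, who in messages}
    gives = dict.fromkeys(people, 0)
    takes = dict.fromkeys(people, 0)
    window = timedelta(minutes=window_min)
    for i, (t_i, w_i) in enumerate(messages):
        # Giving: the most recent message from someone else lies within the window.
        for t_j, w_j in reversed(messages[:i]):
            if w_j != w_i:
                if t_i - t_j <= window:
                    gives[w_i] += 1
                break
        # Taking: distinct people who write after me,
        # before I write again or the window closes.
        responders = set()
        for t_k, w_k in messages[i + 1:]:
            if t_k - t_i > window or w_k == w_i:
                break
            responders.add(w_k)
        takes[w_i] += len(responders)
    return gives, takes

# Hypothetical toy chat; D only shows up hours after the exchange.
day = datetime(2026, 3, 15)
toy = [
    (day.replace(hour=9, minute=0), "A"),
    (day.replace(hour=9, minute=5), "B"),
    (day.replace(hour=9, minute=10), "C"),
    (day.replace(hour=9, minute=20), "A"),
    (day.replace(hour=13, minute=0), "D"),
]
gives, takes = give_take(toy)
for person in sorted(gives):
    print(person, "given", gives[person], "received", takes[person])
```

A's first message draws reactions from B and C, so A receives 2. Each of A, B and C reacts once to someone else, so each gives 1. D writes almost four hours after the last message, which is outside the 60-minute window, so D neither gives nor receives. This is the case the script prints as "–".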
What you should see
A ranking of who writes what percentage; a Gini value (near 0 = everyone writes a similar amount, near 1 = one person dominates); a median response time; and, per person, how many reactions they give and how much resonance they receive. In the bundled example chat (five people, a project week) the organiser writes the most at around 38 %, the quietest participant only about 10 % (Gini ≈ 0.23), and the median response time is around 15 minutes. Giving and taking shows the nice part: the organiser receives about twice as much resonance as she gives (G/T ≈ 0.5) — her messages move the group — while the quietest participant barely joins the lively back-and-forth (he reacts only hours later and triggers no quick reaction, hence "–"). If a chat tips into "one broadcasts, everyone else is silent", you see it at once in the high Gini.
Worksheet
A gut feeling becomes a number
- Before you run the code: guess from your gut who writes the most in your chat and whether the group is "balanced". Then compare with the numbers — where were you off?
- Why do we deliberately read only the timestamps here and not the content? Name one advantage for the analysis and one for privacy.
- Response time measures the gap to the next message from a different person. Why would it be wrong to simply take the gap to the next message of any kind?
- Someone has a giving-to-taking ratio of 0.3. What does that mean — and is it automatically "bad"? Give one harmless explanation.
- Which of the six signals can you not compute from timestamps, and why do you need a different tool for it (Chapter 7 or 8)?
Solutions
1. Individual. The point is the experience that intuition is often roughly right, but regularly off on "who exactly writes how much" and on the response times — which is precisely why we measure.
2. Analysis: the patterns (who, when, how quickly) are honest signals that can hardly be steered — unlike words, which can be dressed up. Privacy: without content there is no reading of private messages; you analyse only structure, not secrets.
3. Because two quick messages from the same person are not a response but a continued monologue. "Reply" means someone else reacts — only that measures listening and engaging with each other.
4. The person reacts to others noticeably less often than others react to them — they "take" more than they "give". That is not automatically bad: perhaps they are the idea-giver or moderator whose contributions trigger many replies. Interpret numbers, don't judge with them.
5. The connecting leadership (betweenness centrality) can't be read off the bare order — for that you need the network of who is connected to whom (Activity 8.1). And the shared language / honest mood needs a content analysis (the symbiont tool from Chapter 7).
When it sticks
| Problem | Likely cause & fix |
|---|---|
| No lines recognised (empty df) | The export's date/time format differs (phone language, 12-hour format). Look at one sample line and adjust the pattern (e.g. AM/PM). |
| Multi-line messages count wrongly | Continuation lines without a timestamp are correctly ignored — only the line header counts as a message. That is intended. |
| System lines ("… created the group") | Have no "Name:" part and fall through the pattern. Filter them out additionally if needed. |
| Response time unrealistically large | Long overnight gaps inflate the value. They skew the median less than the mean, which is why the script uses the median. Optionally exclude gaps of more than 8 h. |
| Emojis/special characters break the reading | Open the file with encoding="utf-8" (already so in the code); on errors add errors="ignore". |
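If no lines are recognised, a more tolerant pattern is one possible fix. The sketch below accepts "." or "/" in the date and an optional AM/PM after the time. The names `flexible`, `FORMATS` and `parse_line` are hypothetical, the sample lines are invented, and the month-first order for 12-hour lines is an assumption. Compare with a real line from your own export before relying on it.

```python
import re
from datetime import datetime

flexible = re.compile(
    r"^(\d{1,2}[./]\d{1,2}[./]\d{2,4}),?\s+"   # date with "." or "/"
    r"(\d{1,2}:\d{2}(?:\s?[AP]M)?)"            # time, optionally followed by AM/PM
    r"\s+-\s+([^:]+):",                        # " - Name:"
    re.IGNORECASE,
)

# Candidate formats, tried in order (assumed; extend to match your own export).
FORMATS = ["%d.%m.%y %H:%M", "%d.%m.%Y %H:%M", "%m/%d/%y %I:%M %p"]

def parse_line(line):
    """Return (timestamp, sender) for a message header, or None for any other line."""
    m = flexible.match(line)
    if not m:
        return None
    date, clock, name = m.groups()
    for fmt in FORMATS:
        try:
            return datetime.strptime(f"{date} {clock}", fmt), name.strip()
        except ValueError:
            continue
    return None  # header recognised, but none of the formats fits

samples = [
    "15.03.26, 17:44 - Mira: Text",              # 24-hour, day first
    "3/15/26, 5:44 PM - Mira: Text",             # 12-hour, month first (assumed)
    "15.03.26, 17:44 - Mira created the group",  # system line: no "Name:" part
    "second line of a longer message",           # continuation line
]
for s in samples:
    print(parse_line(s))
```

The first two samples yield the same timestamp and sender. The system line and the continuation line return None, so they are skipped just as in the original pattern.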
Food for thought
- You have measured a group without reading a single word of its content — from the pattern of communication alone. That is exactly what the author once did with his Deloitte mailbox (Chapter 8): not the content of the emails, but their pattern. Structure often reveals more than content.
- These measures are approximations, not verdicts. A high Gini does not mean "bad group" — perhaps someone is giving an important presentation right then. Numbers are a reason to look closer, not the verdict itself.
- Because these signals are so hard to fake, they are powerful, and therefore delicate. Whoever runs such analyses carries responsibility: share aggregated results with the team, and personal results only with the person concerned. That is the line that turns observation into trust.
Extension
- Over time. Split the chat into weeks and see whether the contribution balance moves: does the dominant person change, or does the leadership stick to one person? That shifting is itself one of the six signals.
- Daily rhythm. Plot the messages by time of day. When is the group most active? A first look at the "pulse" of a community.
- Bridge to 8.1. From the "who replies to whom" of this activity you can draw a network directly — and compute the connecting leadership (betweenness) in it. That is exactly what you do next.
- Bridge to Chapter 7. Now you have measured the structure. The symbiont tool of the next chapter also reads the content and assigns each person a role — Bee, Ant, Butterfly, Capybara, Leech.
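The first three extensions can be started with a few lines each. The following is a rough sketch under stated assumptions, not the book's implementation: the function names are hypothetical, the toy messages are invented, and "who replies to whom" is approximated by consecutive messages from different people within a time window.

```python
from collections import Counter, defaultdict
from datetime import datetime

def weekly_top_writer(messages):
    """Per ISO week: who wrote the most, and with what share of that week's messages."""
    per_week = defaultdict(Counter)
    for t, who in messages:
        year, week, _ = t.isocalendar()
        per_week[(year, week)][who] += 1
    result = {}
    for key in sorted(per_week):
        counts = per_week[key]
        top, top_n = counts.most_common(1)[0]
        result[key] = (top, top_n / sum(counts.values()))
    return result

def messages_per_hour(messages):
    """Daily rhythm: number of messages per hour of the day."""
    return Counter(t.hour for t, _ in messages)

def reply_edges(messages, window_min=60):
    """Count (replier, addressee) pairs as raw material for a network."""
    edges = Counter()
    for (t1, w1), (t2, w2) in zip(messages, messages[1:]):
        if w1 != w2 and (t2 - t1).total_seconds() / 60 <= window_min:
            edges[(w2, w1)] += 1
    return edges

# Hypothetical toy chat spanning two ISO weeks, sorted by time.
toy = [
    (datetime(2026, 3, 3, 9, 0), "A"),
    (datetime(2026, 3, 3, 9, 5), "B"),
    (datetime(2026, 3, 4, 20, 0), "A"),
    (datetime(2026, 3, 10, 9, 10), "B"),
    (datetime(2026, 3, 10, 9, 30), "B"),
    (datetime(2026, 3, 11, 20, 15), "C"),
]

for (year, week), (top, top_share) in weekly_top_writer(toy).items():
    print(f"{year}-W{week}: {top} leads with {top_share:.0%}")
print("By hour:", dict(messages_per_hour(toy)))
print("Reply edges:", dict(reply_edges(toy)))
```

In the toy data the top writer changes from A to B between the two weeks, which is the kind of shift the "Over time" extension looks for. With the script's DataFrame, the same functions should accept `list(zip(df["time"], df["who"]))`, since pandas timestamps behave like `datetime` objects here.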