Hidden Signals · Companion site
Chapter 5 · The Chemistry of Teams · Activity 5.2

The Six Honest Signals

From the bare timestamps of a real chat — without reading a word of the content — you compute response times, the balance of contributions and the ratio of giving to taking. An intuition about a group becomes a measurement.

Duration: 90 min · Difficulty: medium · Group: alone or in pairs · Fully digital

In a nutshell

What: You export a WhatsApp chat (e.g. your class or project group) and let Python read patterns from it — not what is talked about, but how: who writes how much? How quickly are replies sent? Does everyone respond to each other, or do some just talk to themselves?

Why: Exactly these patterns are among the six honest signals that predict whether a group flourishes creatively. You turn a gut feeling about a group into numbers and lay the groundwork for the symbiont tool in Chapter 7.

You need: an exported chat history (.txt) and Python with pandas. Runs in the browser too (Pyodide).

What it's about

Put five people in a room and give them a task — after a few minutes something arises that you can almost touch: a mood, a rhythm, a kind of chemistry. For a long time that was considered unmeasurable. This chapter shows that it is measurable, because a group's chemistry lives in its honest signals — in things no one steers consciously.

Six such signals predict surprisingly well whether a group becomes productive: whether there is a strong, connecting leadership and whether it shifts rather than sticking to one person; how quickly members react to each other; how evenly the contributions are shared; how honestly moods may be expressed; and how much the group develops a language of its own. In this activity you pick out three of them that can already be computed from bare timestamps — response time, contribution balance and giving and taking. The leadership question (betweenness centrality) follows in Activity 8.1.

Patterns only, never content — and only with consent

This analysis reads no content, only timestamps and senders. Even so: a chat history belongs to everyone who writes in it. Get the group's consent, anonymise the names (Person A, B, C …), and don't share the results outside. That is the golden rule of this book: what you find out about others belongs to them, not to you. The cleanest choice is a chat you are a member of yourself.

A little background

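What we read from timestamps. Every chat line carries three plain facts: when, who, and (the one we ignore here) what. From when and who alone, three measures arise:

  1. Contribution balance. Each person's share of all messages, summarised as a Gini coefficient (0 = everyone writes the same amount).
  2. Response time. The median gap between a message and the next message from a different person.
  3. Giving and taking. How often a person reacts to someone else within 60 minutes (giving), and how many others react to that person's messages within the same window (taking).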

These are rough but honest approximations — they turn "the group feels lively" into a number you can compare.
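Of the three measures, giving and taking is the least obvious, so it helps to trace it once by hand. The following is a minimal sketch on an invented five-message log, with times written as plain minutes instead of real timestamps. The log, the names A, B, C and the variable names are assumptions made for illustration; the counting rules follow the description in this activity.

```python
# Invented toy log: (minute, sender). No message content is needed.
log = [(0, "A"), (2, "B"), (3, "C"), (200, "A"), (205, "B")]
WINDOW = 60  # minutes that still count as a reaction

gives = {p: 0 for _, p in log}   # how often a person reacts to someone else
takes = {p: 0 for _, p in log}   # how many people react to a person's messages

for i, (t_i, p_i) in enumerate(log):
    # Giving: look back to the most recent message by someone else.
    for t_j, p_j in reversed(log[:i]):
        if p_j != p_i:
            if t_i - t_j <= WINDOW:
                gives[p_i] += 1
            break
    # Taking: distinct people who write within the window,
    # before the same person writes again.
    responders = set()
    for t_k, p_k in log[i + 1:]:
        if t_k - t_i > WINDOW or p_k == p_i:
            break
        responders.add(p_k)
    takes[p_i] += len(responders)

print("gives:", gives)
print("takes:", takes)
```

In this toy log A never reacts to anyone in time but collects three reactions, B reacts twice and is answered once, and C reacts once without being answered. A's second message at minute 200 does not count as giving, because the last message from someone else is more than an hour old.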

Exporting the chat

  1. Start the export. In WhatsApp open the chat → Menu → "More" → "Export chat" → "Without media". You get a .txt file.
  2. Consent & anonymise. Ask the group. Before the analysis, replace the real names with A, B, C … — or let the code do it (see below).
  3. Put the file in place. Save it as chat.txt in the same folder as the script.
  4. Analyse. Run the script and read off the three measures.
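If you prefer to anonymise the file before any analysis touches it, step 2 can be sketched as a small pre-processing pass. This is one possible approach, not the book's own tool: the function name anonymise_lines, the placeholder "[removed]" and the sample lines are invented for illustration, and the header pattern assumes the export format "15.03.26, 17:44 - Name: Text".

```python
import re

# Assumed header of an export line: "15.03.26, 17:44 - Mira: Text..."
HEADER = re.compile(r"^(\d{1,2}\.\d{1,2}\.\d{2,4},?\s+\d{1,2}:\d{2}\s+-\s+)([^:]+):")

def anonymise_lines(lines):
    """Return header lines with pseudonyms instead of names and no content."""
    pseudonyms = {}                      # real name -> "A", "B", "C", ...
    cleaned = []
    for line in lines:
        m = HEADER.match(line)
        if not m:
            continue                     # continuation and system lines are dropped
        stamp, name = m.group(1), m.group(2).strip()
        if name not in pseudonyms:
            # Letters run out after 26 people; fine for a class or project group.
            pseudonyms[name] = chr(ord("A") + len(pseudonyms))
        cleaned.append(f"{stamp}{pseudonyms[name]}: [removed]")
    return cleaned

# Invented sample lines, including one continuation line.
sample = [
    "15.03.26, 17:44 - Mira: Who brings the poster?",
    "and the glue?",
    "15.03.26, 17:46 - Jonas: I can do that",
    "15.03.26, 17:47 - Mira: Great",
]
for row in anonymise_lines(sample):
    print(row)
```

The cleaned lines keep the same header layout, so they can be written back to chat.txt and analysed as usual, while the original export with names and content can be deleted.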

Analysing with Python

The code splits each line into a time and a sender, throws the content away and computes the three measures. Full, commented code on GitHub.

import re
import pandas as pd

WINDOW = 60   # minutes: what still counts as a "response"

# WhatsApp line:  "15.03.26, 17:44 - Mira: Text..."
pattern = re.compile(r"^(\d{1,2}\.\d{1,2}\.\d{2,4}),?\s+(\d{1,2}:\d{2})\s+-\s+([^:]+):")

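rows = []
with open("chat.txt", encoding="utf-8") as f:          # file is closed automatically
    for line in f:
        m = pattern.match(line)
        if m:                                          # continuation/system lines are skipped
            date, clock, name = m.groups()
            ts = pd.to_datetime(f"{date} {clock}", dayfirst=True, errors="coerce")
            rows.append((ts, name.strip()))

# Timestamps only resolve to the minute: a stable sort keeps the file order within a minute.
df = (pd.DataFrame(rows, columns=["time", "who"]).dropna()
        .sort_values("time", kind="mergesort").reset_index(drop=True))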

# --- anonymise: real names -> A, B, C ... ---
names = {n: chr(65+i) for i, n in enumerate(df["who"].unique())}
df["who"] = df["who"].map(names)

# 1) contribution balance
share = df["who"].value_counts() / len(df)
p = share.sort_values().values                         # inequality as Gini
gini = 1 - 2*sum((len(p)-i-0.5)/len(p) * pi for i, pi in enumerate(p))

# 2) response time: median to the next message from ANOTHER person
df["dt_min"] = df["time"].diff().dt.total_seconds() / 60
df["other"] = df["who"].ne(df["who"].shift())
response_time = df.loc[df["other"], "dt_min"].median()

# 3) giving & taking within the time window
time, who, n = df["time"].tolist(), df["who"].tolist(), len(df)
gives = {p: 0 for p in set(who)}     # I react to another person
takes = {p: 0 for p in set(who)}     # my message triggers reactions
for i in range(n):
    for j in range(i-1, -1, -1):                       # giving: do I react?
        if who[j] != who[i]:
            if (time[i]-time[j]).total_seconds()/60 <= WINDOW:
                gives[who[i]] += 1
            break
    responders = set()                                 # taking: who reacts to me?
    for k in range(i+1, n):
        if (time[k]-time[i]).total_seconds()/60 > WINDOW or who[k] == who[i]:
            break
        responders.add(who[k])
    takes[who[i]] += len(responders)

print("Contribution share (%):")
print((share*100).round(1).sort_values(ascending=False).to_string())
print(f"\nInequality (Gini, 0=equal .. 1=one person): {gini:.2f}")
print(f"Median response time between people:        {response_time:.0f} min\n")
print("Giving & taking (given / received):")
for pn in sorted(set(who), key=lambda x: -takes[x]):
    g, e = gives[pn], takes[pn]
    print(f"  {pn}  given {g:2d}   received {e:2d}   G/T {g/e:.2f}" if e
          else f"  {pn}  given {g:2d}   received {e:2d}   G/T  –")

What you should see

A ranking of who writes what percentage; a Gini value (near 0 = everyone writes a similar amount; the more one person dominates, the closer it gets to its maximum of 1 − 1/n for n active writers, i.e. 0.8 for five people); a median response time; and, per person, how many reactions they give and how much resonance they receive. In the bundled example chat (five people, a project week) the organiser writes the most at around 38 %, the quietest participant only about 10 % (Gini ≈ 0.23), and the median response time is around 15 minutes. Giving and taking shows the nice part: the organiser receives about twice as much resonance as she gives (G/T ≈ 0.5), so her messages move the group, while the quietest participant barely joins the lively back-and-forth (he reacts only hours later and triggers no quick reaction, hence "–"). If a chat tips into "one broadcasts, everyone else barely writes", you see it at once in the high Gini. Members who never write at all do not appear in the export, so the Gini cannot count them.
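To get a feeling for the Gini scale, it can help to feed a few hand-made distributions into the formula. The sketch below uses the same formula as the script, rewritten for raw message counts. The function name gini and all count lists are assumptions for illustration; in the "example" list only the 38 and the 10 echo the shares mentioned above, the three middle values are invented.

```python
def gini(counts):
    """Gini coefficient of message counts (one count per active writer)."""
    c = sorted(counts)                   # ascending, as in the script
    n, total = len(c), sum(c)
    weighted = sum((n - i - 0.5) * ci for i, ci in enumerate(c))
    return 1 - 2 * weighted / (n * total)

distributions = {
    "equal":    [20, 20, 20, 20, 20],
    "example":  [10, 15, 17, 20, 38],    # middle three values are invented
    "lopsided": [1, 1, 1, 1, 96],
    "solo":     [100],                   # the others never wrote at all
}
for label, counts in distributions.items():
    print(f"{label:9s} Gini = {gini(counts):.2f}")
```

The equal group lands at 0, the invented example at roughly 0.24 and the lopsided group at 0.76, just below the five-person maximum of 0.8. The "solo" case is the caveat: if everyone else is completely silent, only one writer is counted and the value drops back to 0, so a low Gini should always be read together with the number of active writers.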

Worksheet

A gut feeling becomes a number

  1. Before you run the code: guess from your gut who writes the most in your chat and whether the group is "balanced". Then compare with the numbers — where were you off?
  2. Why do we deliberately read only the timestamps here and not the content? Name one advantage for the analysis and one for privacy.
  3. Response time measures the gap to the next message from a different person. Why would it be wrong to simply take the gap to the next message of any kind?
  4. Someone has a giving-to-taking ratio of 0.3. What does that mean — and is it automatically "bad"? Give one harmless explanation.
  5. Which of the six signals can you not compute from timestamps, and why do you need a different tool for it (Chapter 7 or 8)?
Solution

1. Individual. The point is the experience that intuition is often roughly right, but regularly off on "who exactly writes how much" and on the response times — which is precisely why we measure.

2. Analysis: the patterns (who, when, how quickly) are honest signals that can hardly be steered — unlike words, which can be dressed up. Privacy: without content there is no reading of private messages; you analyse only structure, not secrets.

3. Because two quick messages from the same person are not a response but a continued monologue. "Reply" means someone else reacts — only that measures listening and engaging with each other.

4. The person reacts to others noticeably less often than others react to them — they "take" more than they "give". That is not automatically bad: perhaps they are the idea-giver or moderator whose contributions trigger many replies. Interpret numbers, don't judge with them.

5. The connecting leadership (betweenness centrality) can't be read off the bare order — for that you need the network of who is connected to whom (Activity 8.1). And the shared language / honest mood needs a content analysis (the symbiont tool from Chapter 7).
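The answer to question 3 can be checked on a toy log. The sketch below compares the naive median over all gaps with the median over gaps where the sender changes. The log is invented and uses plain minutes instead of timestamps; the variable names are assumptions for illustration.

```python
from statistics import median

# Invented log: A sends three quick messages, B answers half an hour later.
log = [(0, "A"), (1, "A"), (2, "A"), (32, "B"), (33, "B"), (63, "A")]

all_gaps, reply_gaps = [], []
for (t_prev, p_prev), (t, p) in zip(log, log[1:]):
    gap = t - t_prev
    all_gaps.append(gap)                 # naive: every gap counts
    if p != p_prev:
        reply_gaps.append(gap)           # only gaps where someone else takes over

print("median of all gaps:  ", median(all_gaps), "min")
print("median of reply gaps:", median(reply_gaps), "min")
```

The naive median is 1 minute, because the quick follow-up messages from the same person dominate, while the gaps between different people are 30 minutes each. The naive number would make this group look far more responsive than it is.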

When it sticks

Problem | Likely cause & fix
No lines recognised (empty df) | The export's date/time format differs (phone language, 12-hour format). Look at one sample line and adjust the pattern (e.g. AM/PM).
Multi-line messages count wrongly | Continuation lines without a timestamp are correctly ignored; only the line header counts as a message. That is intended.
System lines ("… created the group") | They have no "Name:" part and fall through the pattern. Filter them out additionally if needed.
Response time unrealistically large | Long overnight gaps skew the median less than the mean, hence the median. Optionally exclude messages more than 8 h apart.
Emojis/special characters break the reading | Open the file with encoding="utf-8" (already so in the code); on errors add errors="ignore".
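Two of the fixes listed above, the adjusted pattern and the optional 8 h cut-off, can be sketched as follows. The first pattern is the one from the script; the other two are assumptions about how exports from other phone settings may look and have to be checked against a real sample line from your own file. The function names split_header and drop_long_gaps are invented for illustration.

```python
import re

PATTERNS = [
    # From the script:      15.03.26, 17:44 - Mira: Text
    re.compile(r"^(\d{1,2}\.\d{1,2}\.\d{2,4}),?\s+(\d{1,2}:\d{2})\s+-\s+([^:]+):"),
    # Assumed 12-hour form:  3/15/26, 5:44 PM - Mira: Text
    re.compile(r"^(\d{1,2}/\d{1,2}/\d{2,4}),?\s+(\d{1,2}:\d{2}\s?[AP]M)\s+-\s+([^:]+):",
               re.IGNORECASE),
    # Assumed bracket form:  [15.03.26, 17:44:05] Mira: Text
    re.compile(r"^\[(\d{1,2}\.\d{1,2}\.\d{2,4}),?\s+(\d{1,2}:\d{2}(?::\d{2})?)\]\s+([^:]+):"),
]

def split_header(line):
    """Return (date, clock, name) for the first matching pattern, else None."""
    line = line.lstrip("\ufeff\u200e")   # invisible marks that may precede a line
    for pattern in PATTERNS:
        m = pattern.match(line)
        if m:
            date, clock, name = m.groups()
            return date, clock, name.strip()
    return None

def drop_long_gaps(gaps_min, limit_min=8 * 60):
    """Keep only response gaps of at most limit_min minutes (default 8 h)."""
    return [g for g in gaps_min if g <= limit_min]

print(split_header("3/15/26, 5:44 PM - Mira: Hi"))
print(split_header("15.03.26, 17:44 - Mira created the group"))   # system line
print(drop_long_gaps([5, 480, 600]))
```

If a month-first date format like the second one applies to your export, the date parsing in the script has to follow: pd.to_datetime would then need dayfirst=False instead of dayfirst=True.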

Food for thought

Extension
