Marco Polo solved one of the hardest problems in personal communication: how do you stay genuinely close to people across distance and time zones? Its answer is video messages you can watch whenever you're ready, which feel warmer than text and more thoughtful than a call. Tens of millions of families use it. But Marco Polo still lives in a single language: grandma records in Spanish, and her grandchild listens and understands nothing. Babel is what happens when the warmth of video messaging meets real-time translation: the people you love, in the language you understand.
| Feature | Marco Polo | Babel |
|---|---|---|
| Video messaging | ✓ Core feature — asynchronous video you record and watch anytime | Focused on real-time voice and text, not async video |
| Real-time translation | ✗ No translation — every message is monolingual | ✓ Built-in real-time translation for voice and text conversations |
| Live voice calls | ✗ No live calls — async video only | ✓ Real-time voice with live translation across languages |
| Multilingual families | Video feels warm but language gap remains unbridged | ✓ Designed for exactly this — family members speak their native tongue |
| Async messaging | ✓ Core strength — record anytime, watch anytime | Primarily real-time — async text translation in supported modes |
| Language support | ✗ One language per video — whatever the speaker uses | ✓ Multiple languages simultaneously in a single conversation |
| Group conversations | ✓ Group video threads supported | ✓ Group voice and text with per-person language preferences |
| Text messaging | Secondary feature — primarily a video platform | ✓ Text with real-time translation as first-class feature |
| Primary use case | Families and friends staying close across distance | Multilingual individuals, communities, and teams |
Marco Polo understood something that most messaging apps missed: people don't just want to exchange information, they want to feel close. A video of your dad laughing at the kitchen table carries something that a text message can't. The app was built around that insight and it works — families and friend groups use it to maintain genuine intimacy across thousands of miles.
But video intimacy hits a hard wall when the people on either end don't share a language. A grandmother recording a Marco Polo in Portuguese can send the warmth of her voice, the expression on her face, the familiar way she gestures when she talks. What she can't send is comprehension. Her grandchildren, who grew up speaking English, will see her and feel the love, but they won't understand the words. The emotional connection is real; the communicative connection is broken.
This is the gap Babel fills. Not by replacing the intimacy of video, but by making conversations across language lines as natural as conversations within a single language.
The clearest illustration of where these apps fit comes from immigrant families. First-generation parents who arrived speaking Vietnamese, Spanish, Tagalog, or Urdu raise children who grow up dominant in English. Marco Polo is popular in these families because video feels more personal than a text chain, and grandparents don't need to type. But the conversations are necessarily short and simple — the child says a few words they know in the grandparent's language, the grandparent responds slowly, everyone smiles through the gap.
Babel doesn't try to replace that warmth. What it adds is the ability to have a real conversation. Not the pantomime of mutual goodwill that often substitutes for communication between generations with different languages, but an actual exchange where the child asks a question in English and hears the answer in English, while grandma hears the question in her language and answers in hers. Both are fully themselves in the conversation.
Used together, the two apps serve different needs: Marco Polo for the "here's a video of what my life looks like today" moment, Babel for the "let's actually talk about this" conversation. For multilingual families, both matter.
Marco Polo is fundamentally asynchronous. You record when you have time. The other person watches when they have time. Nobody needs to be available simultaneously. This is genuinely useful: many of the most meaningful personal conversations happen across time zones, where a live call would require someone to be awake at 3 a.m.
Babel is fundamentally real-time. Two people speaking in their native languages, hearing each other in their native languages, as the conversation happens. There's no delay between speaking and being understood. This opens a different kind of conversation — the dynamic, responsive, interruptible kind where meaning is negotiated in the moment rather than packaged and delivered.
Neither approach is strictly better. A parent watching their child's video message and replaying the funny moment three times is using the async format correctly. A child and grandparent discovering they can actually talk to each other without a translator between them is using real-time correctly. The question is which kind of conversation you're trying to have.
Join thousands of families and friends who use Babel to have conversations that language used to make impossible.
Join the Waitlist →
Babel: real-time translation for every conversation