

Posts tagged with "developers"

Hands-On: How Apple’s New Speech APIs Outpace Whisper for Lightning-Fast Transcription

Late last Tuesday night, after watching F1: The Movie at the Steve Jobs Theater, I was driving back from dropping Federico off at his hotel when I got a text:

Can you pick me up?

It was from my son Finn, who had spent the evening nearby and was stalking me in Find My. Of course, I swung by and picked him up, and we headed back to our hotel in Cupertino.

On the way, Finn filled me in on a new class in Apple’s Speech framework called SpeechAnalyzer and its SpeechTranscriber module. Both the class and module are part of Apple’s OS betas that were released to developers last week at WWDC. My ears perked up immediately when he told me that he’d tested SpeechAnalyzer and SpeechTranscriber and was impressed with how fast and accurate they were.
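For a sense of what Finn was describing, here's a minimal sketch of the new API flow. It's pieced together from the first betas, so treat the exact signatures – the option sets, analyzeSequence(from:), and finalizeAndFinish(through:) – as assumptions that may shift before release:

    import AVFoundation
    import Speech

    // A minimal sketch of the SpeechAnalyzer/SpeechTranscriber flow from
    // the first OS betas; exact signatures are assumptions and may change.
    func transcribe(file url: URL) async throws -> String {
        // SpeechTranscriber is a module that plugs into an analyzer session.
        let transcriber = SpeechTranscriber(
            locale: Locale(identifier: "en-US"),
            transcriptionOptions: [],
            reportingOptions: [],                 // e.g., .volatileResults for live partials
            attributeOptions: [.audioTimeRange]   // timestamps, handy for SRT output
        )
        let analyzer = SpeechAnalyzer(modules: [transcriber])

        // Accumulate finalized text as the analyzer works through the file.
        async let transcript = transcriber.results.reduce(into: "") { text, result in
            text += String(result.text.characters)
        }

        let audioFile = try AVAudioFile(forReading: url)
        if let lastSample = try await analyzer.analyzeSequence(from: audioFile) {
            try await analyzer.finalizeAndFinish(through: lastSample)
        }
        return try await transcript
    }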

It’s still early days for these technologies, but I’m here to tell you that their speed alone is a game changer for anyone who uses voice transcription to create text from lectures, podcasts, YouTube videos, and more. That’s something I do multiple times every week for AppStories, NPC, and Unwind, generating transcripts that I upload to YouTube because the site’s built-in transcription isn’t very good.

What’s frustrated me with other tools is how slow they are. Most are built on Whisper, OpenAI’s open source speech-to-text model, which was released in 2022. It’s cheap at under a penny per one million tokens, but it isn’t fast, which is a problem when you’re in the final steps of a YouTube workflow.

An SRT file generated by Yap.

I asked Finn what it would take to build a command line tool to transcribe video and audio files with SpeechAnalyzer and SpeechTranscriber. He figured it would only take about 10 minutes, and he wasn’t far off. In the end, it took me longer to get around to installing macOS Tahoe after WWDC than it took Finn to build Yap, a simple command line utility that takes audio and video files as input and outputs SRT- and TXT-formatted transcripts.
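If you haven't looked inside an SRT file before, it's a refreshingly simple format: numbered cues, each with a timestamp range and the text. Here's a quick sketch of the formatting, with helper names that are mine, not Yap's:

    import Foundation

    // SRT is plain text: a cue index, a "start --> end" timestamp line,
    // the text, and a blank line. These helpers are illustrative, not
    // taken from Yap's source.
    func srtTimestamp(_ seconds: Double) -> String {
        let whole = Int(seconds)
        let ms = Int((seconds - Double(whole)) * 1000)
        return String(format: "%02d:%02d:%02d,%03d",
                      whole / 3600, (whole % 3600) / 60, whole % 60, ms)
    }

    func srtCue(index: Int, start: Double, end: Double, text: String) -> String {
        """
        \(index)
        \(srtTimestamp(start)) --> \(srtTimestamp(end))
        \(text)

        """
    }

For example, srtCue(index: 1, start: 0, end: 2.5, text: "Welcome to AppStories.") produces a transcript's first cue.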

Yesterday, I finally took the Tahoe plunge and immediately installed Yap. I grabbed the 7GB 4K video version of AppStories episode 441, which is about 34 minutes long, and ran it through Yap. It took just 45 seconds to generate an SRT file; Yap also ripped through nearly 20% of an episode of NPC in just 10 seconds.


Next, I ran the same file through VidCap and MacWhisper, using the latter’s Large V2 and Large V3 Turbo models. Here’s how each app and model did:

App                          Transcription Time
Yap                          0:45
MacWhisper (Large V3 Turbo)  1:41
VidCap                       1:55
MacWhisper (Large V2)        3:55

All three transcription workflows had similar trouble with last names and words like “AppStories,” which LLMs tend to separate into two words instead of camel casing. That’s easily fixed by running a set of find and replace rules, although I’d love to feed those corrections back into the model itself for future transcriptions.
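Those rules don't need to be anything fancy. A post-processing pass can be as simple as a list of replacements – the rules below are hypothetical examples, not my actual list:

    import Foundation

    // A trivial cleanup pass over a finished transcript.
    // The correction rules are hypothetical examples.
    let corrections: [(find: String, replace: String)] = [
        ("App Stories", "AppStories"),
        ("Mac Stories", "MacStories"),
    ]

    func applyCorrections(to transcript: String) -> String {
        corrections.reduce(transcript) { text, rule in
            text.replacingOccurrences(of: rule.find, with: rule.replace)
        }
    }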

Once transcribed, a video can be used to generate additional formats like outlines.

What stood out above all else was Yap’s speed. By harnessing SpeechAnalyzer and SpeechTranscriber on-device, the command line tool tore through the 7GB video file a full 2.2× faster than MacWhisper’s Large V3 Turbo model, with no noticeable difference in transcription quality.

At first blush, the difference between 0:45 and 1:41 may seem insignificant, and it arguably is, but those are the results for just one 34-minute video. Extrapolate that to running Yap against the hours of Apple Developer videos released on YouTube with the help of yt-dlp, and suddenly, you’re talking about a significant amount of time. Like all automation, picking up a 2.2× speed gain one video or audio clip at a time, multiple times each week, adds up quickly.

Whether you’re producing video for YouTube and need subtitles, generating transcripts to summarize lectures at school, or doing something else, SpeechAnalyzer and SpeechTranscriber – available across the iPhone, iPad, Mac, and Vision Pro – mark a significant leap forward in transcription speed without compromising on quality. I fully expect this combination to replace Whisper as the default model for transcription apps on Apple platforms.

To test Apple’s new model, install the macOS Tahoe beta, which currently requires an Apple developer account, and then install Yap from its GitHub page.


Hand Crafted: Don’t Count Developers Out

Source: Apple.

We’re days away from WWDC, and I’m excited. As much as I enjoy a good Apple hardware event, it’s WWDC’s focus on software that I truly love. But what WWDC means to me runs much deeper than the OS updates we’ll hear about next week. Of course, Apple’s announcements are a big part of what makes WWDC a special time of the year, but for me, they’re overshadowed by the people.

I’ve been to every WWDC since 2013. That first year, I sat on the sidewalk in line at 3 AM on a cold morning. I hardly knew anyone in the Apple developer community then, but after hours in that line and attending the events surrounding the conference, I’d gotten to know a few developers.

By the time 2016 rolled around, I was writing at MacStories and interviewing developers for the site, including the founders of Workflow, which became Shortcuts. Now, they’re building Sky. After that WWDC, Federico hit the nail on the head in Issue 37 of MacStories Weekly:

…there’s something special about meeting someone you’ve known for a long time exclusively through the Internet. While I thought I knew some people and had made some special friendships through the years, getting to know them in person is different.

He’s right, and even though WWDC is much smaller than it used to be, I look forward to the chance to get to know the developers whose apps we’ve covered, meet new people, and reconnect with old friends.

What’s special about so many of the developers I’ve met over the years is how much they care about their craft. They sweat all the details. Over the years, we’ve seen many of them go from novices to the makers of apps with big, passionate followings among our readers.

We’ve also seen developers and their importance to Apple’s hardware success undervalued by the very company whose platforms they’re so dedicated to. That’s not new, but it’s gotten palpably worse as the years have worn on.

Since WWDC 2024, that trend has collided head-on with the rise of artificial intelligence. I imagine that our reaction to learning that Apple had scraped MacStories and every other website to train their LLMs was familiar to developers who have felt taken advantage of for years. That was a bitter pill to swallow, but one of the upsides of the experience is that over the past year, it’s forced me to spend a lot of time thinking about creativity, work, and our relationship with technology.

To hear the AI fans tell it, I, the developers we write about, and nearly everyone else will be out of jobs before long. Some days, that threat feels very real, and others, not so much. Still, it’s caused a lot of anxiety for a lot of people.

However, as I get ready to head to this year’s WWDC, I’m far more optimistic than I was after WWDC 2024. I don’t expect AI to replace our friends in the indie developer community; far from it. That’s because what sets a great app apart from the pack on the App Store is the care and humanity that’s poured into it. I’ve yet to see a vibe-coded app that comes anywhere close. Those apps will simply join the vast sea of mediocrity that has always made up a big part of the App Store. Instead, I expect AI will help solo developers and small teams tackle bigger problems that were once the exclusive domain of bigger teams with more resources.

I realize this all may sound like blasphemy to anyone who’s either devoted to AI or dead set against it, but I believe there’s room for AI to serve the artist instead of the other way around. So despite the challenges developers, writers, and others are facing, I’m heartened by the many excellent apps I’ve tried in the past year and look forward to meeting and reconnecting with as many of their creators as I can next week.

If you see me and Federico wandering about, stop us to say hi. We’d love to hear what you’re working on.


2025 Apple Design Awards Winners and Finalists Announced

As WWDC approaches, Apple has announced the finalists for its annual Apple Design Awards, and in a departure from recent years, the winners too.

This year, there are six categories, and each category has a winning app and game, along with four finalists. Unlike last year, there is no Spatial Computing category this year. The 2025 ADA winners and finalists are:

Delight and Fun

Winners:

Finalists:

Innovation

Winners:

Finalists:

Interaction

Winners:

  • App
    • Taobao by Zhejiang Taobao Network
  • Game

Finalists:

Inclusivity

Winners:

Finalists:

Social Impact

Winners:

  • App
  • Game
    • Neva by Devolver Digital

Finalists:

Visuals and Graphics

Winners:

Finalists:

The winners and finalists include a broad range of apps and games, from titles by smaller developers – like Lumy, Denim, Art of Fauna, and Skate City: New York – to releases from bigger publishers.

I’m glad that Apple has announced the finalists along with the winners for the last few years. Winning an ADA is a big achievement for any developer, but being named a finalist is quite an honor too, given the many apps that could have been chosen. Plus, as a fan of apps, I find that the longer finalist list always reminds me of an app or two that I haven’t tried yet. Congratulations to all of this year’s Apple Design Award winners and finalists.


EU Sets DMA Compliance Deadline in App Store Anti-Steering Dispute

Last month, the European Commission (EC) fined Apple €500 million for violating the Digital Markets Act. Today, the EC issued its full 67-page ruling on the matter, giving Apple until July 23 to pay the fine or face accruing interest on the penalty.

The ruling focuses on Apple’s anti-steering rules, which were the focus of the contempt order recently entered by a U.S. District Court Judge in California. According to the EC:

Apple has not substantiated any security concerns. Apple simply states that some limitations, such as linking out only to a website that the app developer owns or has responsibility for, are allegedly grounded in security reasons. However, Apple does not explain why the app developer’s website is more secure than a third party website which the app developer has taken the conscious decision to link out to. It also does not explain why this limitation is objectively necessary and proportionate to protect the end user’s security and therefore has not provided any adequate justifications in this regard.

(EC ruling at p. 22). In other words, “the App Store isn’t more secure than the web just because you say it is.”

Apple has until June 22 to bring the App Store into compliance with the EC’s ruling or face additional periodic penalties (EC ruling at p. 67). As we reported in April, Apple has said that it intends to appeal the EC’s ruling.


Apple Spotlights Four of the Distinguished Swift Student Challenge Winners

Image: Apple.

Earlier this year, Apple selected 350 students from around the world as winners of its annual Swift Student Challenge. From that talented pool, Apple picks 50 Distinguished Winners whose projects stand out from the others. Today, Apple highlighted the work of four of them: Taiki Hamamoto, Marina Lee, Luciana Ortiz Nolasco, and Nahom Worku.

Taiki Hamamoto. Image: Apple.

Taiki Hamamoto built an app playground to teach people about Hanafuda, a Japanese card game that he discovered many of his friends didn’t know about. According to Apple’s press release:

While Hamamoto stayed true to the game’s classic floral iconography, he also added a modern touch to the gameplay experience, incorporating video game concepts like hit points (HP) that resonate with younger generations. SwiftUI’s DragGesture helped him implement dynamic, highly responsive effects like cards tilting and glowing during movement, making the gameplay feel natural and engaging. He’s also experimenting with making Hanafuda Tactics playable on Apple Vision Pro.
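For the curious, that tilt-while-dragging effect takes surprisingly little SwiftUI. Here's a rough sketch of the general technique – not Hamamoto's actual code, and the tilt and shadow values are arbitrary:

    import SwiftUI

    // A card that tilts and casts a stronger shadow as it's dragged,
    // then springs back on release. Values are illustrative.
    struct TiltingCard: View {
        @State private var dragOffset: CGSize = .zero

        var body: some View {
            RoundedRectangle(cornerRadius: 12)
                .fill(.red.gradient)
                .frame(width: 120, height: 180)
                .offset(dragOffset)
                // Tilt in proportion to how far the card has moved.
                .rotationEffect(.degrees(Double(dragOffset.width) / 15))
                .shadow(radius: abs(dragOffset.width) / 30)
                .gesture(
                    DragGesture()
                        .onChanged { value in dragOffset = value.translation }
                        .onEnded { _ in
                            withAnimation(.spring()) { dragOffset = .zero }
                        }
                )
        }
    }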

Marina Lee. Image: Apple.

Marina Lee is a computer science student at the University of Southern California. A call from her grandmother, who had been alerted to evacuate her home because of the wildfires in the L.A. area, inspired Lee to create EvacuMate, an app that helps users prepare an emergency checklist in case of evacuations like her grandmother’s. In addition:

Lee integrated the iPhone camera roll into the app so users can upload copies of important documents, and added the ability to import emergency contacts through their iPhone contacts list. She also included resources on topics like checking air quality levels and assembling a first-aid kit.

Luciana Ortiz Nolasco. Image: Apple.

Luciana Ortiz Nolasco built BreakDownCosmic:

a virtual gathering place where users can add upcoming astronomical events around the world to their calendars, earn medals for accomplishing “missions,” and chat with fellow astronomers about what they see.

Ortiz Nolasco, who is 15 and from Nuevo León, Mexico, will attend WWDC with the other Distinguished Winners and plans to continue work on BreakDownCosmic when she returns home, with the goal of releasing it on the App Store.

Nahom Worku. Image: Apple.

Nahom Worku grew up in Ethiopia and Canada and learned to code during the pandemic. Worku’s Swift Student Challenge submission, an app playground called AccessEd, is designed to offer educational resources in places where Internet connectivity is spotty or nonexistent.

Built using Apple’s machine learning and AI tools, such as Core ML and the Natural Language framework, the app recommends courses based on a student’s background, creating a truly personalized experience.

Congratulations to all of this year’s Swift Student Challenge winners. I’m always impressed with the projects we’ve learned about through Apple’s press releases and past interviews we’ve done on AppStories. It’s always a pleasure to watch a new generation of kids learn to code and become the developers whose apps I know we’ll cover in coming years on MacStories.


Post-Chat UI

Fascinating analysis by Allen Pike on how, beyond traditional chatbot interactions, the technology behind LLMs can be used in other types of user interfaces and interactions:

While chat is powerful, for most products chatting with the underlying LLM should be more of a debug interface – a fallback mode – and not the primary UX.

So, how is AI making our software more useful, if not via chat? Let’s do a tour.

There are plenty of useful, practical examples in the story showing how natural language understanding and processing can be embedded in different features of modern apps. My favorite example is search, as Pike writes:

Another UI convention being reinvented is the search field.

It used to be that finding your flight details in your email required typing something exact, like “air canada confirmation”, and hoping that’s actually the phrasing in the email you’re thinking of.

Now, you should be able to type “what are the flight details for the offsite?” and find what you want.

Having used Shortwave and its AI-powered search for the past few months, I couldn’t agree more. The moment you get used to searching without exact queries or specific operators, there’s no going back.

Experience this once, and products with an old-school text-match search field feel broken. You should be able to just find “tax receipts from registered charities” in your email app, “the file where the login UI is defined” in your IDE, and “my upcoming vacations” in your calendar.
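On Apple platforms, the basic ingredient for this kind of fuzzy matching has shipped for years in the Natural Language framework. Here's a toy sketch that ranks documents by semantic distance to a query – illustrative only, since shipping products use much larger embedding models:

    import NaturalLanguage

    // Rank documents by semantic similarity to a natural-language query
    // using the system sentence embedding. Lower distance means closer;
    // the default distance metric is cosine.
    func search(_ query: String, in documents: [String]) -> [String] {
        guard let embedding = NLEmbedding.sentenceEmbedding(for: .english) else {
            return documents
        }
        return documents.sorted {
            embedding.distance(between: query, and: $0) <
            embedding.distance(between: query, and: $1)
        }
    }

    // A query with none of the exact words can still surface the right email.
    let results = search("what are the flight details for the offsite?",
                         in: ["Your Air Canada booking confirmation",
                              "Team offsite agenda",
                              "Tax receipt from a registered charity"])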

Interestingly, Pike mentions Command-K bars as another interface pattern that can benefit from LLM-infused interactions. I knew that sounded familiar – I covered the topic in mid-November 2022, and I still think it’s a shame that Apple hasn’t natively implemented these anywhere in their apps, especially now that commands can be fuzzier (just consider what Raycast is doing). Funnily enough, that post was published just two weeks before the public debut of ChatGPT on November 30, 2022. That feels like forever ago now.


A Peek Into LookUp’s Word of the Day Art and Why It Could Never Be AI-Generated

Yesterday, Vidit Bhargava, developer of the award-winning dictionary app LookUp, wrote on his blog about the way he hand-makes each piece of artwork that accompanies the app’s Word of the Day. Revealing that he has kept up this practice every day for an astonishing 10 years, Vidit explained how each image is made from scratch, either as an illustration or with photography he shoots specifically for the design:

Each Word of the Day has been illustrated with care, crafting digital illustrations, picking the right typography that conveys the right emotion.

Some words contain images, these images are painstakingly shot, edited and crafted into a Word of the Day graphic by me.

I’ve noticed before that each Word of the Day image in LookUp seemed unique, but I assumed Vidit was using stock imagery and illustrations as a starting point each time. The revelation that he is creating almost all of these from scratch every single day was incredible and gave me a whole new level of respect for the developer.

The idea of AI-generated art (specifically art that is wholly generated from scratch by LLMs) is something that really sticks in my throat – never more so than with the recent rip-off of the beautiful, hand-drawn Studio Ghibli films by OpenAI. Conversely, Vidit’s work shows passion and originality.

To quote Vidit, “Real art takes time, effort and perseverance. The process is what makes it valuable.”

You can read the full blog post here.


Is Electron Really That Bad?

I’ve been thinking about this video by Theo Browne for the past few days, especially in the aftermath of my story about working on the iPad and realizing its best apps are actually web apps.

I think Theo did a great job contextualizing the history of Electron and how we got to this point where the majority of desktop apps are built with it. There are two sections of the video that stood out to me, and I want to highlight them here. First, this observation – which I strongly agree with – regarding the desktop apps we ended up with thanks to Electron and why we often consider them “buggy”:

There wouldn’t be a ChatGPT desktop app if we didn’t have something like Electron. There wouldn’t be a good Spotify player if we didn’t have something like Electron. There wouldn’t be all of these awesome things we use every day. All these apps… Notion could never have existed without Electron. VS Code and now Cursor could never have existed without Electron. Discord absolutely could never have existed without Electron.

All of these apps are able to exist and be multi-platform and ship and theoretically build greater and greater software as a result of using this technology. That has resulted in some painful side effects, like the companies growing way faster than expected because they can be adopted so easily. So they hire a bunch of engineers who don’t know what they’re doing, and the software falls apart. But if they had somehow magically found a way to do that natively, it would have happened the same exact way.

This has nothing to do with Electron causing the software to be bad and everything to do with the software being so successful that the companies hire too aggressively and then kill their own software in the process.

The second section of the video I want to call out is the part where Theo links to an old thread from the developer of BoltAI, a native SwiftUI app for Mac that went through multiple updates – and a lot of work on the developer’s part – to ensure the app wouldn’t hit 100% CPU usage when simply loading a conversation with ChatGPT. As documented in the thread from late 2023, this is a common issue for the majority of AI clients built with SwiftUI, which is often less efficient than Electron when it comes to rendering real-time chat messages. Ironic.
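For what it’s worth, the usual first-line mitigation in SwiftUI is to render long conversations lazily, so streaming in new messages doesn’t force a full re-layout of the entire transcript. A generic sketch – not BoltAI’s actual fix, and the message type is hypothetical:

    import SwiftUI

    // The point here is LazyVStack, which only builds rows as they
    // scroll into view instead of laying out the whole conversation
    // on every update.
    struct ChatMessage: Identifiable {
        let id = UUID()
        let text: String
    }

    struct ConversationView: View {
        let messages: [ChatMessage]

        var body: some View {
            ScrollView {
                LazyVStack(alignment: .leading, spacing: 8) {
                    ForEach(messages) { message in
                        Text(message.text)
                            .frame(maxWidth: .infinity, alignment: .leading)
                    }
                }
                .padding()
            }
        }
    }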

Theo argues:

You guys need to understand something. You are not better at rendering text than the Chromium team is. These people have spent decades making the world’s fastest method for rendering documents across platforms because the goal was to make Chrome as fast as possible regardless of what machine you’re using it on. Electron is cool because we can build on top of all of the efforts that they put in to make Electron and specifically to make Chromium as effective as it is. The results are effective.

The fact that you can swap out the native layer with SwiftUI with even just a web view, which is like Electron but worse, and the performance is this much better, is hilarious. Also notice there’s a couple more Electron apps he has open here, including Spotify, which is only using less than 3% of his CPU. Electron apps don’t have to be slow. In fact, a lot of the time, a well-written Electron app is actually going to perform better than an equivalently well-written native app because you don’t get to build rendering as effectively as Google does.

Even if you think you made up your mind about Electron years ago, I suggest watching the entire video and considering whether this crusade against more accessible, more frequently updated (and often more performant) desktop software still makes sense in 2025.


WWDC 2025 Scheduled for June 9-13 Along with Special Event at Apple Park

Source: Apple.

WWDC25 will be held from June 9-13 this year and include an in-person experience on June 9 that will provide developers the opportunity to watch the keynote at Apple Park, meet with Apple team members, and take part in special activities. Space will be limited, and details on how to apply to attend can be found on the Apple Developer site and app.

Apple has announced that WWDC 2025 will primarily take place online again this year from June 9-13, 2025. However, the company said that it will simultaneously hold a corresponding limited in-person event at Apple Park for developers, students, and press, like last year.

In a press release issued today, Susan Prescott, Apple’s Vice President of Worldwide Developer Relations and Enterprise and Education Marketing, said:

We’re excited to mark another incredible year of WWDC with our global developer community. We can’t wait to share the latest tools and technologies that will empower developers and help them continue to innovate.

Apple also had this to say about events that will be held at Apple Park during the conference:

To celebrate the start of WWDC, Apple will also host an in-person experience on June 9 that will provide developers with the opportunity to watch the Keynote and Platforms State of the Union at Apple Park, meet with Apple experts one-on-one and in group labs, and take part in special activities. Space will be limited; details on how to apply to attend can be found on the WWDC25 website.

As time passes, fewer of the people I used to count on seeing at WWDC attend. I suppose that’s to be expected now that the event is primarily online. Still, I’m just as excited as ever for this year’s event. It’s a chance to preview new technology and meet many of the developers whose work we cover. And with rumors of new hardware on the horizon and a design refresh for all of Apple’s OSes, I’m sure this year’s WWDC will be as interesting as ever.

Of course, MacStories readers can expect the same comprehensive WWDC coverage we deliver every year on MacStories, AppStories, and MacStories Unwind, which will extend to Club MacStories too.