Podcast: Want a job? The AI will see you now
Published 4 years ago
By Terry Power
In the past, hiring decisions were made by people. Today, some key decisions that lead to whether someone gets a job or not are made by algorithms. The use of AI-based job interviews has increased since the pandemic. As demand increases, so too do questions about whether these algorithms make fair and unbiased hiring decisions, or find the most qualified applicant. In this second episode of a four-part series on AI in hiring, we meet some of the big players making this technology including the CEOs of HireVue and myInterview—and we test some of these tools ourselves.
We Meet:
- Kevin Parker, Chairman & CEO, HireVue
- Shelton Banks, CEO, re:work
- Mark Adams, Vice President of North America, Curious Thing AI
- Benjamin Gillman, Co-Founder and CEO, myInterview
- Fred Oswald, Psychology Professor, Rice University
- Suresh Venkatasubramanian, Computer Science Professor, Brown University; Asst. Director, White House Office of Science and Technology Policy
- Clayton Donnelly, industrial-organizational psychologist, myInterview
We Talked To:
- Kevin Parker, Chairman & CEO, HireVue
- Lindsey Zuloaga, Chief Data Scientist, HireVue
- Nathan Mondragon, Chief IO Psychologist, HireVue
- Shelton Banks, CEO, re:work
- Lisa Feldman Barrett, Psychology Professor, Northeastern University
- Cathy O’Neil, CEO, O’Neil Risk Consulting & Algorithmic Auditing
- Mark Adams, Vice President of North America, Curious Thing AI
- Han Xu, Co-founder & CTO, Curious Thing AI
- Benjamin Gillman, Co-founder & CEO, myInterview
- Fred Oswald, Psychology Professor, Rice University
- Suresh Venkatasubramanian, Computer Science Professor, Brown University; Asst. Director, White House Office of Science and Technology Policy
- Clayton Donnelly, industrial-organizational psychologist, myInterview
- Mark Gray, Director of People, Proper
- Christoph Hohenberger, Co-founder and Managing Director, Retorio
- Derek Mracek, Lead Data Scientist, Yobs
- Raphael Danilo, Co-founder & CEO, Yobs
- Jonathan Kestenbaum, Co-founder & Managing Director of Talent Tech Labs
- Josh Bersin, Global Industry Analyst
- Students and Teachers from the Hope Program in Brooklyn, NY
- Henry Claypool, policy expert and former Director of the U.S. Health and Human Services Office on Disability
Credits:
This miniseries on hiring was reported by Hilke Schellmann and produced by Jennifer Strong, Emma Cillekens, Karen Hao and Anthony Green with special thanks to James Wall. We’re edited by Michael Reilly. Art direction by Stephanie Arnett.
Transcript:
[TR ID]
[Upsot: Working 9 to 5]
Jennifer: Work… is a big part of our lives. It’s how most of us pay our bills, feed our families… and put a roof over our heads.
Michelle Rogers: “A permanent job would mean stability. You need something to keep you going and to keep you fresh.”
Dora Lespier: “Like being able to take my daughter being able to get whatever she needs. It would be amazing.”
Henry Claypool: “You know, it’s, it’s a big part of my identity. It’s what I do a lot. And I enjoy trying to make the world a better place through my work.”
[Upsot.. chorus.. “working 9 to 5”]
Jennifer: In the past we left hiring decisions… with people. These days some of those key decisions that lead to whether someone gets a job or not are made by algorithms… which at least in theory could be more objective than humans.
South Korean Arirang News: Anchor “Technology is transforming the job market as we know it”
CBS News | Businesses turning to AI for job interviews, Anchor: “So…If you’re on the hunt for a new job, there’s a new twist in the hiring process: artificial intelligence.”
CBS Philly | Companies Using Artificial Intelligence To Hire Employees -Anchor 1: Well, what if the next time you apply for a job you were interviewed by a computer?
Anchor 2: Well, it’s possible. Some companies are now using artificial intelligence to help hire employees.
Jennifer: I’m Jennifer Strong and in this second episode of our series on AI and hiring… we look into the rise of AI in job interviews… Just like the software we heard about last episode that decides whether a resume reaches a human… this software helps decide which interviews reach a hiring manager…and it’s completely changing how the interviewing process works.
Benjamin: And one of the things candidates have to do now is, is tons of assignments, psychometric tests, heaps of interviews, and it’s bloody frustrating. And to be honest, it’s pretty black boxy. You don’t know what results you’re getting. You don’t know how people are viewing you… all of these types of things.
Jennifer: Machines are scoring people on the words they use, their tone of voice—sometimes even their facial expressions. We decided to test some of these tools ourselves… and we found rather unexpected things… like that some of the interviewing systems we tried? Don’t necessarily consider people’s answers to the interview questions.
Hilke: So, you’re saying it didn’t take the transcript at all into consideration. Just the intonation of my voice. And for some reason I scored a 73 percent match with the role.
Clayton: Yeah. Well done [laughs]
Jennifer: Some of these tools didn’t even consider whether the interview questions were answered in the correct language…
[Woman speaking in Mandarin]
Fred: Wow. That’s even more shocking. I would argue, you know, at least with German, maybe there are some cognates that look somewhat similar, but for Mandarin, I can’t imagine how that could be reliable, let alone lead to a high score.
Jennifer: The question is whether these are challenges we can overcome… or whether it’s a sign of a deeper problem.
Suresh: Should we be making better AI systems for hiring, or should we be trying to essentially bring down the entire enterprise?
[TITLES]
Jennifer: Remote job interviews without a human on the other end got a huge boost during the pandemic. In these one-way interviews, every person applying for a position has to answer the same pre-recorded questions. And applicants record their answers on their own device. By far the largest player in this space is a company called HireVue. Its customers include more than a third of the Fortune 100… and it’s used by brands like Unilever, JP Morgan Chase, Delta Air Lines and Target.
Kevin: My name’s Kevin Parker… and I’m the CEO at HireVue, based in Salt Lake City, Utah.
Kevin: HireVue is a 15, 16 year old company whose primary focus is democratizing hiring. We do that today in over 40 languages and over 180 countries around the world. And the primary way we do that is through on-demand interviews, so that candidates can interview for jobs any hour of the day, any day of the week. Uh, we interviewed nearly 6 million people for jobs last year for our customers. About half of those were for hourly workers and about half of those were professional.
Jennifer: HireVue’s clients are not looking to fill a couple of open positions… but rather, they’re often trying to interview thousands of people at once. When I spoke to Parker earlier this year he gave an example of a customer who was interviewing 50-thousand people for jobs in 15-hundred locations… over a weekend.
And he believes the way his company does this… is more fair than the way most humans conduct job interviews.
Kevin: Structured interviewing is the most important way to hire, ask every candidate the same question, ask it the same way, make sure it’s related to work and the skills that they have… and so it’s the ability to deliver structured interviewing at scale that really matters. And we can do that with, uh, with video. And so people can record questions that could be reused multiple times.
Jennifer: To kinda peel that back again – it makes it sound like it’s just a video recording of a person asking a question, but there’s, there’s more to it than that. Your company is also processing what’s happening on the other end.
Kevin: What we’re really looking at there is the words the candidate is using to describe their team orientation or their ability to work independent or their problem solving skills. Uh, so we, we can assess individual competencies for candidates, and we can use those algorithms to understand the answers that they have, that they’re, that they’re giving.
Jennifer: HireVue’s algorithms are trained on top, middle and low performers and are looking for the differences between them. The algorithms then compare new video interviews of job applicants against that data.
Unlike some other vendors, the company’s AI does analyze the actual content of what people say… unless a client chooses not to use that feature. It also tries to examine other cues in their voices… and until recently, claimed it could find meaning in people’s facial expressions… which is incredibly controversial and often criticized.
The company says the small value the facial expression analysis added wasn’t worth the criticism it attracted… and so they’re phasing it out.
But despite the controversy… The use of HireVue’s tools continues to grow … and it’s being used in some unexpected ways.
Shelton: I’m using real live humans and that’s not dependable… so let’s give this a shot and see what it does.
Jennifer: Shelton Banks is the chief executive of re:work… a nonprofit, which gets free access to HireVue.
Shelton: We were skeptics like most people were of like, man, this isn’t gonna work. This is going to be biased. This is gonna be a danger here. And we’ve realized that there’s danger everywhere… there’s going to be bias everywhere.
Jennifer: His organization aims to help people from diverse backgrounds move up in the workforce and into better paying jobs.
Shelton: The demographic that I serve is 95% Black and Brown, but we’ve been training candidates from what we like to call untapped and overlooked communities for the last five years and been helping people get jobs in the tech sales industry.
Jennifer: The training includes how to do a job interview – (with people or with AI). These days he also uses AI-based interviews to select people for the program, but when it first started he used volunteers to scrutinize applicants…
Shelton: Like we have all these volunteers that want to help and assist us. And so we put together a rubric of questions, behavioral questions that you typically get asked at an interview. And we say on a scale of one to five, how well did they do answering this question, give them feedback.
Jennifer: But the scores the volunteers gave were inconsistent… and often too good to be true.
Shelton: Like, tell me about a time you, you failed at something. And I had a response go: “I’ve never failed at anything.” Uh, I just like took the glasses off and said, uh, that was horrible. [laughs] Like, what do you mean? You never, like, you, you are unemployed right now. Like you have challenges, sir. Like, come on. Like, but interestingly enough, a volunteer would say like, okay, like, man, that’s awesome. Like, man, you, you you’re confident. But then I will put the same candidate in a, in a real interview. And then they wouldn’t get a job and be wondering why.
Jennifer: So some people who scored highly in their entrance interviews turned out to be more challenging to teach.
Shelton: We would invite these high scoring people into the program and then throughout the eight weeks, we would just run into roadblocks like man, you are rough around the edges. Like I, you, you..mmm. Like, I got you for 8 weeks. Like who, who let you, who gave you this gold star… And then HireVue comes along and says, Hey, we got this AI tool. And, uh, I’m like, I don’t want it, like, what’s it going to do? And they’re like here it is, it, uh, it helps you, you know, give a person’s score, give a percentage of how well a person interviews.
Jennifer: The algorithm scores people based on different traits that job candidates probably need in a tech sales job.
Shelton: …And we said like, Hey, everybody take this HireVue assessment. They’re going to record you. We’re going to use AI. And of course the results were kind of all over the place. So I got people scoring 99%. I got people scoring 5%.
Jennifer: So, he ran a series of tests… just to see what would happen.
Shelton: And so then the next cohort, I said, all right, well, I’m only going to take the people at the top tier, you know, that HireVue said were the best, we’re only going to take them and I’m going to train them. Cohort of 10 folks, I want to say seven of them got jobs. It was, like, difficult for me to get the seven people, jobs that HireVue said were the best. So now I’m going to do the exact opposite. I’m going to get 10 people that are at the bottom and see if I can help give them jobs. Worst cohort of my life. Eight weeks of people that they learned, they grew, but none of them got jobs.
Jennifer: He took a closer look at the results…. And he started noticing some patterns… like that HireVue gives better scores to people who sound convincing… regardless of what they actually say.
Shelton: They talked and the tone and pitch and pace was on point and, and understandable. But it was like they had no context. It was just kinda like, everything sounded great, but you know, didn’t answer the question necessarily. They just sounded great.
Jennifer: But what happened next surprised him.
Shelton: I would train people and put them in front of a real person. And it was just like, people started to get jobs.
Jennifer: In other words…HireVue seemed to be over-indexing on a candidate’s delivery over content…but so were real hiring managers.
Shelton: People started to get jobs left and right off just like that little piece, you know, it, it changed the way we recruit and changed the way we train.
Jennifer: These days… Banks uses HireVue to recruit people into the program and to train them to get jobs in the industry.
He doesn’t just take high scoring candidates, but instead has come up with his own way of doing things, with a mix of high performers, middle tier applicants, and then 10% coming from the bottom.
Shelton: You get this diversity of thought and diversity of experience in the cohort, which makes for, man, great cohorts.
Jennifer: He believes combining his judgement with AI… helps him make the best decisions about who to choose for his program.
Shelton: You take the HireVue. I see you’re in the bottom bucket and it’s like, man, I want to help you. This is going to be rough. But sometimes, you know, you know, like, uh, they prove HireVue wrong. You shouldn’t always listen to the tool. You shouldn’t always listen to the tool. But the tool will help you make an informed decision.
Jennifer: But are we really making informed decisions when we auto score job interviews?
Hilke Schellmann is our reporting partner on this series…
Hey Hilke… You’ve been deep in researching these interviews… Tell us what you found.
Hilke: Yeah. So we found that… video interviews are controversial. We don’t know yet how good AI is in auto-scoring the content of these interviews. Does a computer really “understand” our answers? Can it analyze the many ways humans talk about teamwork for example?
Video interviews are also controversial, because some of them use AI to analyze job applicants’ facial expressions and the tone of their voices to predict if someone will succeed in a given job. So this raises a few questions: How good is the software in analyzing the words we say? In reading the facial expressions on our faces and intonation in our voices? And also, what facial expressions or tone of voice does one need for a given job? Is this even relevant to a given job?
Jennifer: Yeah and scientists have repeatedly questioned how good these automated facial and emotion readers are… even… whether it’s something that’s possible to do at all… So, what exactly are these algorithms predicting… and can we see under the hood, so to speak?
Hilke: Sometimes yes… sometimes no… Very few people are actually getting access to see how these black box algorithms work…. and there’s a lot of concern about these tools in theory, but there isn’t a lot of insight into the tools themselves. That’s why we wanted to try to take a closer look.
Jennifer: So… Altogether, we looked at seven tools with up to five people test driving each one. Were there any that stood out to you?
Hilke: Yeah… one really did stand out to me. It’s called Curious Thing AI.. It’s an AI phone interview platform. I actually found out about this company at a recent HR Tech conference.
Mark: Hey everyone. Good morning. Welcome to our demo.
Jennifer: Mark Adams is its Vice President of North America.
Mark: …Curious Thing… for those of you who don’t know us, we are a conversational AI voice interview solution. Essentially, your candidates will do an interview with a voice AI.
Digital Interviewer: Welcome to the digital interviewer hosted by Curious Thing, AI. My name is Christine. Thank you very much for joining me today….
Mark: We work especially well when applied to high volume recruiting scenarios. When you’ve got to hire a lot of people in a short space of time, we can really streamline that process. We use a bunch of really interesting AI technologies like natural language processing, knowledge graph, deep transfer learning, and we have clients right now here in the U.S., in our home country of Australia, Philippines, New Zealand and Singapore.
Jennifer: Ok, so the AI is designed to help hiring managers pick out the right people for a job amongst hundreds or thousands of applicants.
Mark: So let’s look at this candidate here, uh, at the top and see what the AI has actually done in terms of the interview. The recording of it is here and I can play it back if I choose to, but generally that’s not the best use of the time because the whole point of it is to reduce your screening time and just focus on candidates you want. So you can listen to it but we don’t recommend that you do that.
Hilke: So, this was pretty telling to me. Usually a lot of these companies tell me that their AI is just one data point amongst many to make a hiring decision. And hiring managers should really watch or listen at least to some parts of the interviews… which in my mind defeats a little bit the purpose of using an automated tool to make hiring easier… but here Mark Adams spells it out to hiring managers: Do not spend time on the recording.
Jennifer: He also says accents don’t matter.
Mark: Our AI is completely resilient to accents or stutters or any kind of, um, sort of verbal tic that, um, you know, might cause a human, a recruiter to think one way or another about the candidate. So the AI is actually listening to the spoken word, and then it’s being streamed into text in real time. And the analysis is actually just done on the text of the conversation.
Jennifer: So Hilke wanted to test it out…since her first language is German.
AI Interviewer: Please remember. I don’t think there are right or wrong answers here. Let’s start. Tell me about a tough work situation you have gone through. What did you do and what was the outcome?
Hilke: I once had a boss… who was a micromanager… and that was very hard to deal with because she would second guess everything.
Jennifer: So you got an expert level score – then you tested it again, speaking only in German… And what were you expecting to see happen there?
Hilke: I believe in stress testing these systems to understand how the scoring really works. And… I’ve talked to many vendors and they’ve told me that if someone has issues speaking or there was another problem with the detection of a person’s voice, the software would recognize the problem. HireVue for example told me that there is a minimum threshold that a candidate needs to meet for the system to score them.
Jennifer: Got it. So you thought you would just get an error message or something when you answered the questions in German instead of English?
Hilke: Exactly.
Jennifer: What actually ended up happening?
Hilke: So, it assessed me on me speaking German but gave me an English competency score. So… I was scored six out of nine… and my skill level in English is competent.
Jennifer: That’s wild. So… you only spoke German, but the software said you were competent in English?
Hilke: Yeah…. I was confused too… So I redid the experiment.. Same result.
Hilke: So I did a similar experiment with myInterview, which we talked about earlier. It’s a video interview tool in which the algorithm analyzes the words I say and my tone of voice. It then rates how good of a fit I am for the role.
Jennifer: And just for some context – the companies we’re talking about here are much smaller than like say HireVue – but they aren’t tiny either. These tools are used by millions of people. This particular company, myInterview, was founded in Israel and also operates in the U-S, U-K and Australia.
Benjamin: So our customers are typically, um, small, medium enterprises. Diverse 4,000 companies use our platform from a myriad of different industries. We’ve got a very large candidate pool, um, over 3.4 million candidate interviews through the, through the site.
Jennifer: Benjamin Gillman is the company’s chief executive. He spoke to us back in April… and said they only need 30 seconds of audio to give an insight into a candidate’s personality. Plus.. he said the tool works with different accents.
Benjamin: The error is, is quite negligible. The insights we’re giving could be a 0.2% change, maybe in the assumption that this person is outgoing. Because we’re overlaying tone on top of text, we’re able to mitigate a lot of that, because tone portrays a lot where sometimes a language is deficient.
Jennifer: It seems almost magical to pull full-blown personality profiles out of 30 seconds of audio and text, but Gillman says his team is working to keep their tools from being black boxes.
Benjamin: Our goal is to be very transparent in this and to really communicate exactly what’s happening and how it’s happening and, you know, how the machines are working. We aren’t looking to say, this person is the right hire. All we’re trying to do is help with search. It’s not a system that you can game. It’s not a, uh, it’s not something that’s going to discriminate against you.
Jennifer: Right. So Hilke you also tried speaking German to myInterview… and you got a score there too.
Hilke: I did get a score, but first I got a transcript of what I said.
Jennifer: Ok, and this is what it interpreted your words to be… so I’m just going to read from this transcript here….. Which doesn’t exactly make a lot of sense
So humidity is desk a beat-up.
Sociology, does it iron?
Mined material nematode adapt.
Secure location, mesons the first half gamma their Fortunes in
Hilke: Yup, apparently that’s part of the answer I gave to the first question where I had to tell the machine about myself. And… as you heard…. It’s gibberish.
Jennifer: Ok what results did you get from this?
Hilke: I was scored a 73 percent match for the role although I didn’t speak a word of English and the things I said in German had nothing to do with the questions I was asked or with the job itself.
Jennifer: .. because you were reading German off a Wikipedia page.
Hilke: Yeah… actually 73% is pretty high. And I wanted to make sure this language discovery was not just based on me speaking German…. so… one of the graduate students who I work with was kind enough to record herself in Chinese reading the same Wikipedia text – she scored 80% – And her English transcript is gibberish just like mine.
Jennifer: Right… bringing us back once again to this question of… Are these machines making decisions based on scientific evidence or are they just guessing?
Hilke: Yes… That’s the elephant in the room.
Jennifer: Alright so we’ve reached back out to these companies and we’ll report anything we learn later in this episode… for his part, the company’s CEO told us back in April that he’s receiving very good feedback on this product from customers.
Benjamin: We see that they are hiring people that they might not have considered previously, and that are you know very good fits for their companies and it’s hopefully, uh, uh, less painful and more informed process.
Jennifer: All this makes me really wonder what would happen if researchers ran a lot of tests and scenarios over a longer period of time… what might they find?
Hilke: I would love to see their results.
[quick music beat]
Jennifer: We’ll be back … in a moment
[Midroll]
[quick music transition]
Jennifer: To some… the results we found might not be surprising. Many have called for a ban on AI in hiring…saying the flaws in these tools, (that millions are using to try and land a job), are just not redeemable…
Other experts say such a decision would be hasty and uninformed…sure, there are kinks — but the promise of conducting fairer interviews at scale is too great to let go.
So… we called up an expert ….
Suresh: My name is Suresh Venkatasubramanian.
Jennifer: He’s a computer scientist and a professor at Brown University.
Suresh: Should we be making better AI systems for hiring, or should we be trying to essentially bring down the entire enterprise? It’s this tension I think between, I think what people have called the abolitionist and the incrementalist viewpoint is sort of at the heart of literally every day when I think about these things.
Jennifer: He’s also an AI ethicist who used to serve on HireVue’s Advisory board.
Suresh: And there’s no single answer to these questions, I think it depends on the circumstances. So I think the truth is that for a while I thought that it made sense to try and change things from the inside, and at some point when you feel that maybe that’s not going to happen, then you don’t feel like you’re being effective anymore. And then I felt, okay, it was time to maybe not be there.
Jennifer: He was on the board while HireVue was still selling facial expression analysis to customers. He has doubts these technologies are backed by solid science. And he shared his concerns with the company.
Suresh: Initially the reaction was, okay, Let’s not, you know, immediately stop doing this, but let’s look into this more. Let’s be careful. Okay, fine. Let’s do that. So then you wait and then it comes up again and then you see it again. And at some point you realize that they probably weren’t going to stop doing it. And it didn’t matter what I said.
Jennifer: When HireVue announced it was phasing out the use of this tech… in response, he tweeted: “It’s about time. I used to work with HireVue on their issues around bias and eventually quit over their resistance to dropping video analysis.”
Jennifer: And.. he says companies often hide behind claims that their AI is accurate…but that isn’t the full picture.
Suresh: Saying that your system is accurate, merely means that your system matches what the training data says it should do. And so then there’s also the question of, well, is the training data accurate, which leads you down rabbit holes that often people don’t want to go down.
Jennifer: In other words, he says he got more and more convinced that some vendors in the industry don’t really want to know what their AI is predicting upon.
Suresh: They can put all the guardrails around it, but they have to sell a product. And if your marketing pitch involves a certain tool, an AI tool, a scalable tool, that’s the thing you’re selling. You’re not going to stop selling it. The question is always, if at some point someone says, you shouldn’t do this. Are you going to stop doing it or not? And if you’re not, then there’s no longer a place for the person who says you should not be doing this.
Jennifer: It made him question whether self-regulation is even possible for many companies… and what it is he’d like to see.
Suresh: I would like to see the industry to be more honest, I think, and reflective about what is it they’re trying to solve here. I honestly think that all of this is really about scale, which itself is not a bad thing but there are consequences of going with systems that scale like that.
Suresh: The underlying assumptions about what is valued and what is not are the key here. And so I’m often a little skeptical when I hear well, we want to build us to be more fair. Sure. But within the context of scale, being your primary goal.
Jennifer: After we taped this interview in February, Venkatasubramanian accepted a position in the Biden administration, with the White House Office of Science & Technology Policy.
[Music]
Jennifer: Before we go… we want you to know we told the companies behind the systems we tested about what we did… and about our results. Curious Thing AI thanked us for testing their system… myInterview did too… and they also got on the phone with us to talk about what we found.
Clayton: It knows that you weren’t speaking English for sure.
Jennifer: Clayton Donnelly is myInterview’s industrial-organizational psychologist.
Clayton: Um, but then it defaults to audio only then it won’t use your content because the content you can see, it’s like, um, it’s random, it’s random nonsense. So it won’t, it won’t read that at all, unless we tell it to do that.
Hilke: So, you were saying it didn’t take the transcript at all into consideration. Just the intonation of my voice. And for some reason I scored a 73% match. Like my, the intonation of my voice was a 73% match with the role.
Clayton: Yeah. Well done [laughs]
Jennifer: Ok, so he says the system “understood” that you weren’t speaking English…
Hilke: But if the system scored me on the intonation of my voice alone, I still don’t understand how it could pull a full personality analysis on me, showing for example that I am way more innovative than consistent. How do you find that in the sound of my voice?
Jennifer: So, we shared these findings with psychology professor Fred Oswald at Rice University, who does research in artificial intelligence and hiring.
Fred: If information is being pulled from a video interview it’s definitely important, I would say a responsibility, to understand whether what is being measured is job relevant – what is being measured is understood fairly across applicants, no matter what your background is.
Jennifer: And we played him our tests in German… and Mandarin.
Fred: Wow. That’s even more shocking. I would argue, you know, at least with German, maybe there are some cognates that look somewhat similar, but for Mandarin, I can’t imagine that the score is… at least the way you showed the text, how that could be reliable, let alone lead to a high score.
Jennifer: Basically… he doesn’t think some of these tools should be used to make high-stakes hiring decisions.
Fred: Intonation… I, I don’t, I don’t see much research evidence, um, maybe more work needs to be done. Right? What are the situations where intonation would provide any job relevant information? I think the argument currently has to be that we, we really can’t use intonation as data for hiring that just doesn’t seem fair or reliable or valid.
Jennifer: So, he wants scientists to take a closer look at these tools.
Fred: We want to encourage innovation, but research needs to continue to catch up and gather data on how reliable are these scores? Like what evidence tells you that whatever is being measured reliably and fairly is actually relevant to an employee’s success in the organization, which is good for the organization. That that’s why they’re presumably paying for these tests, but also good for the, for the job applicant, because you want to be doing the right thing and getting rewarded for your performance.
[Music tail]
Jennifer: Next episode…. we turn our attention to A-I games that are used to evaluate potential employees…
Anonymous Jobseeker: Not everyone thinks the same. So how are you inputting that diversity and inclusion when you’re only selecting the people that can figure out a puzzle within 60 seconds?
Jennifer: We present that criticism to someone who designs these games.
Matthew Neale: You know, the disconnect, I think for this candidate was between what the assessment was getting the candidate to do and… this is why it’s relevant… and this is why we’re using it in this particular job.
Jennifer: AND… in part three of this series, we’ll also take a closer look at why the uses of AI in hiring aren’t really regulated.
[Credits]
Jennifer: This miniseries on hiring was reported by Hilke Schellmann and produced by me, Emma Cillekens, Karen Hao and Anthony Green with special thanks to James Wall. We’re edited by Michael Reilly.
Thanks for listening… I’m Jennifer Strong.
My senior spring in high school, I decided to defer my MIT enrollment by a year. I had always planned to take a gap year, but after receiving the silver tube in the mail and seeing all my college-bound friends plan out their classes and dorm decor, I got cold feet. Every time I mentioned my plans, I was met with questions like “But what about school?” and “MIT is cool with this?”
Yeah. MIT totally is. Postponing your MIT start date is as simple as clicking a checkbox.
Now, having finished my first year of classes, I’m really grateful that I stuck with my decision to delay MIT, as I realized that having a full year of unstructured time is a gift. I could let my creative juices run. Pick up hobbies for fun. Do cool things like work at an AI startup and teach myself how to create latte art. My favorite part of the year, however, was backpacking across Europe. I traveled through Austria, Slovakia, Russia, Spain, France, the UK, Greece, Italy, Germany, Poland, Romania, and Hungary.
Moreover, despite my fear that I’d be losing a valuable year, traveling turned out to be the most productive thing I could have done with my time. I got to explore different cultures, meet new people from all over the world, and gain unique perspectives that I couldn’t have gotten otherwise. My travels throughout Europe allowed me to leave my comfort zone and expand my understanding of the greater human experience.
“In Iceland there’s less focus on hustle culture, and this relaxed approach to work-life balance ends up fostering creativity. This was a wild revelation to a bunch of MIT students.”
When I became a full-time student last fall, I realized that StartLabs, the premier undergraduate entrepreneurship club on campus, gives MIT undergrads a similar opportunity to expand their horizons and experience new things. I immediately signed up. At StartLabs, we host fireside chats and ideathons throughout the year. But our flagship event is our annual TechTrek over spring break. In previous years, StartLabs has gone on TechTrek trips to Germany, Switzerland, and Israel. On these fully funded trips, StartLabs members have visited and collaborated with industry leaders, incubators, startups, and academic institutions. They take these treks both to connect with the global startup sphere and to build closer relationships within the club itself.
Most important, however, the process of organizing the TechTrek is itself an expedited introduction to entrepreneurship. The trip is entirely planned by StartLabs members; we figure out travel logistics, find sponsors, and then discover ways to optimize our funding.
In organizing this year’s trip to Iceland, we had to learn how to delegate roles to all the planners and how to maintain morale when making this trip a reality seemed to be an impossible task. We woke up extra early to take 6 a.m. calls with Icelandic founders and sponsors. We came up with options for different levels of sponsorship, used pattern recognition to deduce the email addresses of hundreds of potential contacts at organizations we wanted to visit, and got scrappy with our LinkedIn connections.
And as any good entrepreneur must, we had to learn how to be lean and maximize our resources. To stretch our food budget, we planned all our incubator and company visits around lunchtime in hopes of getting fed, played human Tetris as we fit 16 people into a six-person Airbnb, and emailed grocery stores to get their nearly expired foods for a discount. We even made a deal with the local bus company to give us free tickets in exchange for a story post on our Instagram account.
Tech
The Download: spying keyboard software, and why boring AI is best
Published
1 year ago on 22 August 2023
By
Terry Power
This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.
How ubiquitous keyboard software puts hundreds of millions of Chinese users at risk
For millions of Chinese people, the first software they download onto devices is always the same: a keyboard app. Yet few of them are aware that it may make everything they type vulnerable to spying eyes.
QWERTY keyboards are inefficient as many Chinese characters share the same latinized spelling. As a result, many switch to smart, localized keyboard apps to save time and frustration. Today, over 800 million Chinese people use third-party keyboard apps on their PCs, laptops, and mobile phones.
But a recent report by the Citizen Lab, a University of Toronto–affiliated research group, revealed that Sogou, one of the most popular Chinese keyboard apps, had a massive security loophole. Read the full story.
—Zeyi Yang
Why we should all be rooting for boring AI
Earlier this month, the US Department of Defense announced it is setting up a Generative AI Task Force, aimed at “analyzing and integrating” AI tools such as large language models across the department. It hopes they could improve intelligence and operational planning.
But those might not be the right use cases, writes our senior AI reporter Melissa Heikkilä. Generative AI tools, such as language models, are glitchy and unpredictable, and they make things up. They also have massive security vulnerabilities, privacy problems, and deeply ingrained biases.
Applying these technologies in high-stakes settings could lead to deadly accidents where it’s unclear who or what should be held responsible, or even why the problem occurred. The DoD’s best bet is to apply generative AI to more mundane things like Excel, email, or word processing. Read the full story.
This story is from The Algorithm, Melissa’s weekly newsletter giving you the inside track on all things AI. Sign up to receive it in your inbox every Monday.
The ice cores that will let us look 1.5 million years into the past
To better understand the role atmospheric carbon dioxide plays in Earth’s climate cycles, scientists have long turned to ice cores drilled in Antarctica, where snow layers accumulate and compact over hundreds of thousands of years, trapping samples of ancient air in a lattice of bubbles that serve as tiny time capsules.
By analyzing those cores, scientists can connect greenhouse-gas concentrations with temperatures going back 800,000 years. Now, a new European-led initiative hopes to eventually retrieve the oldest core yet, dating back 1.5 million years. But that impressive feat is still only the first step. Once they’ve done that, they’ll have to figure out how they’re going to extract the air from the ice. Read the full story.
—Christian Elliott
This story is from the latest edition of our print magazine, set to go live tomorrow. Subscribe today for as low as $8/month to ensure you receive full access to the new Ethics issue and in-depth stories on experimental drugs, AI-assisted warfare, microfinance, and more.
The must-reads
I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.
1 How AI got dragged into the culture wars
Fears about ‘woke’ AI fundamentally misunderstand how it works. Yet they’re gaining traction. (The Guardian)
+ Why it’s impossible to build an unbiased AI language model. (MIT Technology Review)
2 Researchers are racing to understand a new coronavirus variant
It’s unlikely to be cause for concern, but it shows this virus still has plenty of tricks up its sleeve. (Nature)
+ Covid hasn’t entirely gone away—here’s where we stand. (MIT Technology Review)
+ Why we can’t afford to stop monitoring it. (Ars Technica)
3 How Hilary became such a monster storm
Much of it is down to unusually hot sea surface temperatures. (Wired $)
+ The era of simultaneous climate disasters is here to stay. (Axios)
+ People are donning cooling vests so they can work through the heat. (Wired $)
4 Brain privacy is set to become important
Scientists are getting better at decoding our brain data. It’s surely only a matter of time before others want a peek. (The Atlantic $)
+ How your brain data could be used against you. (MIT Technology Review)
5 How Nvidia built such a big competitive advantage in AI chips
Today it accounts for 70% of all AI chip sales—and an even greater share for training generative models. (NYT $)
+ The chips it’s selling to China are less effective due to US export controls. (Ars Technica)
+ These simple design rules could turn the chip industry on its head. (MIT Technology Review)
6 Inside the complex world of dissociative identity disorder on TikTok
Reducing stigma is great, but doctors fear people are self-diagnosing or even imitating the disorder. (The Verge)
7 What TikTok might have to give up to keep operating in the US
This shows just how hollow the authorities’ purported data-collection concerns really are. (Forbes)
8 Soldiers in Ukraine are playing World of Tanks on their phones
It’s eerily similar to the war they are themselves fighting, but they say it helps them to dissociate from the horror. (NYT $)
9 Conspiracy theorists are sharing mad ideas on what causes wildfires
But it’s all just a convoluted way to try to avoid having to tackle climate change. (Slate $)
10 Christie’s accidentally leaked the location of tons of valuable art
Seemingly thanks to the metadata that often automatically attaches to smartphone photos. (WP $)
Quote of the day
“Is it going to take people dying for something to move forward?”
—An anonymous air traffic controller warns that staffing shortages in their industry, plus other factors, are starting to threaten passenger safety, the New York Times reports.
The big story
Inside effective altruism, where the far future counts a lot more than the present
October 2022
Since its birth in the late 2000s, effective altruism has aimed to answer the question “How can those with means have the most impact on the world in a quantifiable way?”—and supplied methods for calculating the answer.
It’s no surprise that effective altruism’s ideas have long faced criticism for reflecting white Western saviorism, alongside an avoidance of structural problems in favor of abstract math. And as believers pour even greater amounts of money into the movement’s increasingly sci-fi ideals, such charges are only intensifying. Read the full story.
—Rebecca Ackermann
We can still have nice things
A place for comfort, fun and distraction in these weird times. (Got any ideas? Drop me a line or tweet ’em at me.)
+ Watch Andrew Scott’s electrifying reading of the 1965 commencement address ‘Choose One of Five’ by Edith Sampson.
+ Here’s how Metallica makes sure its live performances ROCK. ($)
+ Cannot deal with this utterly ludicrous wooden vehicle.
+ Learn about a weird and wonderful new instrument called a harpejji.
Tech
Why we should all be rooting for boring AI
Published
1 year ago on 22 August 2023
By
Terry Power
This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.
I’m back from a wholesome week off picking blueberries in a forest. So this story we published last week about the messy ethics of AI in warfare is just the antidote, bringing my blood pressure right back up again.
Arthur Holland Michel does a great job looking at the complicated and nuanced ethical questions around warfare and the military’s increasing use of artificial-intelligence tools. There are myriad ways AI could fail catastrophically or be abused in conflict situations, and there don’t seem to be any real rules constraining it yet. Holland Michel’s story illustrates how little there is to hold people accountable when things go wrong.
Last year I wrote about how the war in Ukraine kick-started a new boom in business for defense AI startups. The latest hype cycle has only added to that, as companies—and now the military too—race to embed generative AI in products and services.
Earlier this month, the US Department of Defense announced it is setting up a Generative AI Task Force, aimed at “analyzing and integrating” AI tools such as large language models across the department.
The department sees tons of potential to “improve intelligence, operational planning, and administrative and business processes.”
But Holland Michel’s story highlights why the first two use cases might be a bad idea. Generative AI tools, such as language models, are glitchy and unpredictable, and they make things up. They also have massive security vulnerabilities, privacy problems, and deeply ingrained biases.
Applying these technologies in high-stakes settings could lead to deadly accidents where it’s unclear who or what should be held responsible, or even why the problem occurred. Everyone agrees that humans should make the final call, but that is made harder by technology that acts unpredictably, especially in fast-moving conflict situations.
Some worry that the people lowest on the hierarchy will pay the highest price when things go wrong: “In the event of an accident—regardless of whether the human was wrong, the computer was wrong, or they were wrong together—the person who made the ‘decision’ will absorb the blame and protect everyone else along the chain of command from the full impact of accountability,” Holland Michel writes.
The only ones who seem likely to face no consequences when AI fails in war are the companies supplying the technology.
It helps companies when the rules the US has set to govern AI in warfare are mere recommendations, not laws. That makes it really hard to hold anyone accountable. Even the AI Act, the EU’s sweeping upcoming regulation for high-risk AI systems, exempts military uses, which arguably are the highest-risk applications of them all.
While everyone is looking for exciting new uses for generative AI, I personally can’t wait for it to become boring.
Amid early signs that people are starting to lose interest in the technology, companies might find that these sorts of tools are better suited for mundane, low-risk applications than solving humanity’s biggest problems.
Applying AI in, for example, productivity software such as Excel, email, or word processing might not be the sexiest idea, but compared to warfare it’s a relatively low-stakes application, and simple enough to have the potential to actually work as advertised. It could help us do the tedious bits of our jobs faster and better.
Boring AI is unlikely to break as easily and, most important, won’t kill anyone. Hopefully, soon we’ll forget we’re interacting with AI at all. (It wasn’t that long ago that machine translation was an exciting new thing in AI. Now most people don’t even think about its role in powering Google Translate.)
That’s why I’m more confident that organizations like the DoD will find success applying generative AI in administrative and business processes.
Boring AI is not morally complex. It’s not magic. But it works.
Deeper Learning
AI isn’t great at decoding human emotions. So why are regulators targeting the tech?
Amid all the chatter about ChatGPT, artificial general intelligence, and the prospect of robots taking people’s jobs, regulators in the EU and the US have been ramping up warnings against AI and emotion recognition. Emotion recognition is the attempt to identify a person’s feelings or state of mind using AI analysis of video, facial images, or audio recordings.
But why is this a top concern? Western regulators are particularly concerned about China’s use of the technology, and its potential to enable social control. And there’s also evidence that it simply does not work properly. Tate Ryan-Mosley dissected the thorny questions around the technology in last week’s edition of The Technocrat, our weekly newsletter on tech policy.
Bits and Bytes
Meta is preparing to launch free code-generating software
A version of its new LLaMA 2 language model that is able to generate programming code will pose a stiff challenge to similar proprietary code-generating programs from rivals such as OpenAI, Microsoft, and Google. The open-source program is called Code Llama, and its launch is imminent, according to The Information. (The Information)
OpenAI is testing GPT-4 for content moderation
Using the language model to moderate online content could really help alleviate the mental toll content moderation takes on humans. OpenAI says it’s seen some promising first results, although the tech does not outperform highly trained humans. A lot of big, open questions remain, such as whether the tool can be attuned to different cultures and pick up context and nuance. (OpenAI)
Google is working on an AI assistant that offers life advice
The generative AI tools could function as a life coach, offering up ideas, planning instructions, and tutoring tips. (The New York Times)
Two tech luminaries have quit their jobs to build AI systems inspired by bees
Sakana, a new AI research lab, draws inspiration from the animal kingdom. Founded by two prominent industry researchers and former Googlers, the company plans to make multiple smaller AI models that work together, the idea being that a “swarm” of programs could be as powerful as a single large AI model. (Bloomberg)