
What Developers Can Do to Make AI Ethical

The release of large language models has caught organizations off guard, and now many are playing catchup. Developers should be part of that.
Jun 30th, 2023 6:00am
Image by Gerd Altmann from Pixabay

Organizations were caught off guard by large language models, and they’re playing catch up, according to responsible AI ethicists. Now that the proverbial cat is out of the bag, and barring outright bans, it’s time to get some footing on AI ethics.

“Why are we focusing on responsible AI so much today? It’s really not a new concern, right?” said Julia Stoyanovich, director of the Center for Responsible AI at New York University, during the Rev4 conference, a Domino Data Lab event held earlier this month. “The reason that we focus on this now is because AI has been so very successful, and very often surprisingly so, that we can no longer shy away from this conversation.”

Julia Stoyanovich, director of the NYU Center for Responsible AI

Photo by Loraine Lawson

Organizations are “throwing together ways of educating their employees across the enterprise about the dangers of LLM and how to use it responsibly,” said ethicist, author and philosophy professor Reid Blackman at Rev4. “Frankly, probably unless you completely shut down access to general AI in your organization, you should just at least start with the quick and dirty approach, the informal approach, on your way to a more full-blown version.”

But eventually, organizations will need to thoroughly address the ethics of AI — looking not just at the creation of the model, but also at how generative AI and large language models are used within the organization.

AI for Not Bad

So far, much of the talk has focused on AI for good, but Blackman contended the focus should be AI for “not bad.” He means we should be prioritizing keeping AI from doing harm at scale, rather than focusing on a particular project that creates positive social impact. He calls it ethical risk mitigation.

“Maybe we’ll have ethical qualms about certain kinds of goals, but for the most part, we’ve got these perfectly ethically acceptable or ethically neutral, certainly not ethically bad, business goals,” he said. “We want to pursue them using the powerful tool that is AI. And we want to do it in a way that’s not ethically, reputationally, regulatorily, or legally problematic. We don’t want it to go off the guardrails. So this is AI for not bad.”

AI for not bad is a sort of “first, do no harm” approach to AI, from development to use, and to its ability to do whatever it does at scale.

“From a business perspective, you need to prioritize not wronging people at scale or harming people at scale over a particular project that creates positive social impact,” Blackman said.

Ethicist, author and philosophy professor Reid Blackman

Photo by Loraine Lawson

Structure and Content: Two Faces of AI Ethics

There are two sides of ethical risk to consider, he added. There’s the structure side of the house: how do we identify and mitigate risks in a systematic and comprehensive way? Then there’s the content side, which is sometimes addressed in a rushed, superficial way, he said.

“People think about the content side very superficially — we see headlines: biased AI, privacy violations, black box models are really scary, killing someone with a car — and then they run to ‘What are we gonna do about it? Well, how do we fix it?’ And in my estimation, that’s too quick,” he said. “It’s too quick because you can’t build a good structure, you can’t do well on the governance side of things. So structure is governance: policies, processes, workflow, tools, training, upskilling, etc. You can’t build an effective AI ethical risk program within an organization unless you understand the content side more deeply.”

Companies need to think through these issues more deeply if they want to create an effective risk mitigation strategy, he added. That means looking across three areas: biased AI, black-box AI, and privacy violations.

“Why are these three an essential part of the landscape?” Blackman said. “The reason is that these three are very likely just given how ML works.”

Machine learning is just software learning from data and recognizing patterns in that data, he explained. Those patterns can be so phenomenally complex, tracking thousands of variables and the thousands of mathematical relations among those variables, that we can’t understand them, he added. This creates the black box problem.

Developers have a responsibility to carefully consider the data that’s being used to train their AI systems, warned Gaurav Kachhawa, CPO of marketing chatbot platform Gupshup.

“Developers must carefully consider the data that’s being used to train their AI systems because if the data is biased, it may creep into the system as well,” Kachhawa told The New Stack via email. “They should provide clear documentation about how the AI system works and have a plan for dealing with issues that may arise such as bias or discrimination.”
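One way to act on that advice is a quick audit of outcome rates across demographic groups in the training data before any model sees it. The sketch below is illustrative only: the `group` and `approved` field names, the toy records, and the 0.8 red-flag threshold (borrowed from the common four-fifths rule of thumb) are assumptions, not anything Kachhawa prescribed.

```python
from collections import defaultdict

def approval_rates(records):
    """Approval rate per demographic group in a labeled dataset."""
    counts = defaultdict(lambda: [0, 0])  # group -> [approved, total]
    for r in records:
        counts[r["group"]][0] += r["approved"]
        counts[r["group"]][1] += 1
    return {g: approved / total for g, (approved, total) in counts.items()}

def disparity_ratio(rates):
    """Min/max ratio of group approval rates; below ~0.8 is a common red flag."""
    return min(rates.values()) / max(rates.values())

# Hypothetical training data: group labels and historical outcomes.
data = [
    {"group": "A", "approved": 1}, {"group": "A", "approved": 1},
    {"group": "A", "approved": 1}, {"group": "A", "approved": 0},
    {"group": "B", "approved": 1}, {"group": "B", "approved": 0},
    {"group": "B", "approved": 0}, {"group": "B", "approved": 0},
]
rates = approval_rates(data)   # {"A": 0.75, "B": 0.25}
print(disparity_ratio(rates))  # ≈ 0.33 — the labels themselves are skewed
```

If the historical labels already encode a skew like this, a model trained on them will likely reproduce it, which is the "bias creeping into the system" Kachhawa warns about.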

Explainable AI is often touted as the solution to the black box problem. It’s the idea that AI could explain how it arrived at a conclusion, including citing sources and explaining how its model might have contributed to the information. Ironically, some people have asked for citations from models that weren’t built with explainability in mind, and what they received were hallucinated citations, Blackman said.

Privacy violations can relate to how the data is collected and what data is used, and that’s where developers can help, Kachhawa said.

Keep Humans in the Loop

“User consent is necessary before collecting any data,” Kachhawa told The New Stack. “Users should be clearly informed about what data is being collected, how it will be used, and who will have access to it.”

Data encryption, limiting the amount of data that’s collected and allowing appropriate user controls can help protect user privacy and prevent data from being used for unauthorized purposes, he added.

“The privacy policies should be clear with user’s control on the data,” advised Kachhawa. “AI systems are not perfect, and they can make mistakes.”

But with AI, privacy violations can happen even if all those steps are ethically above board, Blackman warned. For instance, an AI might be given permission to use location data, but that data can eventually reveal personal information: say it shows that every Monday, an end user goes to a cancer center, and given other data, the AI concludes that person has cancer. If it’s right, that could be considered a violation of the end user’s privacy, Blackman said.

“A lot of models doing that are making predictions about people and in some cases, while the data, the training data and possession of that does not constitute a violation of privacy, the inferred data constitutes a violation of privacy,” he said.
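Blackman’s cancer-center example can be made concrete with a toy script: each location point is benign on its own, yet a simple frequency count is enough to surface a sensitive inference. The `category` field and the visit threshold are assumptions for illustration.

```python
from collections import Counter

def inferred_sensitive(visits, threshold=3):
    """Flag venue categories visited repeatedly -- a proxy a model could turn
    into a sensitive inference even though each data point is benign."""
    counts = Counter(v["category"] for v in visits)
    return {cat for cat, n in counts.items() if n >= threshold}

# Hypothetical location log: every individual point is harmless in isolation.
visits = [{"category": "cancer_center"} for _ in range(4)] + \
         [{"category": "grocery"}]

print(inferred_sensitive(visits))  # {'cancer_center'} — an inferred health fact
```

Possessing the raw log may be perfectly lawful; it is the derived fact that crosses the line, which is why audits need to cover model outputs and inferences, not just the training data.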

That’s why it’s important for developers to ensure there is human oversight in place, Kachhawa said.

“Developers should not rely solely on AI systems to make decisions, and they should always have human oversight in place to ensure that AI systems are used in a safe and ethical manner,” he said. “AI applications have to be used in a way that’s consistent with human values. It should not be used to derive individual conclusions. For example, it should not be used to make decisions about who gets a loan, who gets a job, or who is admitted to school.”

There should also be a feedback mechanism so that when needed, users are able to correct or override the AI application’s decisions if they believe that the AI application is making a mistake, he added.
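A minimal sketch of such a gate: low-confidence outputs are routed to a person, and every human call is logged as feedback. The confidence threshold and the `human_review` callback are hypothetical stand-ins, not a prescribed design.

```python
def decide(score, human_review, threshold=0.9):
    """Auto-apply only high-confidence outputs; route the rest to a human."""
    if score >= threshold:
        return "auto_approved"
    return human_review(score)  # a person makes the final call

overrides = []

def human_review(score):
    overrides.append(score)  # feedback loop: log for audit and retraining
    return "human_decided"

print(decide(0.95, human_review))  # auto_approved
print(decide(0.40, human_review))  # human_decided (and logged in overrides)
```

For the high-stakes cases Kachhawa lists (loans, hiring, admissions) the threshold would effectively be set so that nothing is auto-applied: the model only ranks or flags, and a human always decides.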

“In all of this, what is most important is getting informed consent from users. Users should understand the purpose of the AI application, how their data will be used, and any potential risks or benefits, along with an opportunity to opt out,” Kachhawa cautioned.

Engineers Make the World They Want

Ultimately, though, algorithms are tools that we wield — and we’ll create the world that results from their use, said Stoyanovich.

“Algorithms are probably the most sophisticated tools that people have had at their disposal since the beginning of human history,” Stoyanovich said. “We rejoice that they make life easier for us, but fear that they will enslave us, slightly paraphrasing from the book, algorithms, creations of the human spirit. Algorithms and AI are what we make them and they will be what we want them to be. It’s up to us to choose the world we would live in. And I hope that this point of view resonates with the engineers in the room, and I’m one of them — we don’t just take life as it comes with us.”

Domino Data Lab paid travel and accommodations for The New Stack to attend Rev 4.
