Brian G Herbert
19 min read · Apr 3, 2023


Perspectives on new AI capabilities and what is now possible (and likely)

OpenAI’s release of ChatGPT (built on its GPT-3.5 large language model, a Generative Pre-trained Transformer) in November 2022 will be seen as a pivotal milestone in the history of AI. Widespread public excitement over ChatGPT triggered an acceleration of plans and PR from partners and competitors. Before continuing forward, I felt the need to consolidate feedback on the latest capabilities of large language model AI and how it has altered what is possible.

I allocated time each day for a couple of weeks to review podcasts, articles, discussions, research papers, news, and actual products. I've also experimented with multi-modal generative AI by generating text, Python code, and images. I've included links to the content that I found most helpful, along with my perspective on what these developments make possible.

Optimistic evangelists predict that AI is about to become an indispensable co-pilot, our augmented intelligence, or our real-time assistant and tutor. Cynics predict a dystopian nightmare in which a machine acquires generalized intelligence (aka "strong AI" that can perceive reality as a human does) and then acts in ways that harm or even kill us. For an excellent discussion of narrow versus strong AI and the tradeoffs of neural network training mechanisms like transformers, listen to the Eye on A.I. interview with Yann LeCun (NYU professor and Meta Chief Scientist), which I discuss later.

I'll go with AI as a benign enabler. While LLM AI is general purpose rather than narrow, it does not depend on artificial general intelligence (AGI), and even if LLM AI were a path to AGI, that is a long way off, with many more proximate threats to deal with first. The reason for optimism is that, to me, AI developments are an accelerator of trends in learning and personal empowerment that have progressed over the past 20+ years. We can develop new expertise and transform our lives more effectively than ever. Restricted only by our initiative (and ability to ask the right questions!), any of us can learn a new language, create a professional mix from our audio or video recordings, become a master chef, or provide expert care for a family member with an illness, just to name a few. There has been a revolution in the performance and longevity of athletes due to advances in nutrition; innovative and efficient training and conditioning approaches; and monitoring and analytics. An AI co-pilot for work will be like an athlete having constant access to the best training and monitoring.

Partner with your co-pilot, or you may find yourself irrelevant!

With text generation, GPT saves time and allows me to target areas where I can add value, such as creative, specific, or very recent examples. I didn’t use GPT for this article, but I have used it to generate blocks of text which I then customize. It is a great timesaver and helps with consistency and flow on longer pieces. On the downside, it produces somewhat homogenized text and is prone to occasional “hallucinations,” so fact-checking is essential. I have no doubt OpenAI and other providers are figuring out how to minimize the occurrence of hallucinations.

Multi-modal AI refers to additional forms of output beyond the generation of text. The most exciting output for me is the generation of Python code. With careful prompting, GPT-4 is capable of writing and documenting deployable functions. I was blown away by the time savings when I got a call…

Code Generation

An experienced recruiter once asked me, “In what programming language do you have the most fluency and expertise?”

I said: “English”!😆

In “The Future of GIS with ChatGPT,” Morad Ouasti has GPT write the code for a visualization and deploy it in his GIS system without his touching a single line of code. It illustrates the importance of logically working through what you need and communicating it clearly and without ambiguity (the prompt engineering part). I read this article and thought: this is the new level of productivity and reach possible with these tools, and it will change how progressive companies identify value in a potential employee.

I’ve seen so much time and money spent turning a language that people understand into a language that computers understand, and often critical business needs get lost in translation. I understand there are factors like testing and security that make it difficult to automate or streamline processes, but for my work on data analytics projects, code generation is a viable and productive tool.

For data pipelines, code generation can be combined with offerings that support fast deployment and low overhead (such as cloud tools from modal.com). My goal is to spend 100% of my time analyzing a business scenario and its results. I believe that for an analyst or developer, multi-tasking is the enemy of productivity and error prevention! I am working on combining code generation with other tools that minimize or eliminate task switching.

Last week I asked GPT-4 to create a Python function for me so I could send prompts to the AI’s API directly from my Python scripts. GPT-4 also writes good documentation: it gave me the steps to get my API access key and pass it in my calls to authenticate me as a user. I’ve worked enough with Swagger, Postman, and other API tools that I knew I’d need an API key, but I forgot to ask GPT-4 for that piece. GPT was like, “Hey, Brian, you didn’t ask me, but don’t forget your authentication key!” An assistant that anticipates my needs. Nice!
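For illustration, here is a minimal sketch along the lines of what GPT-4 generated for me. The helper name and defaults are my own; the endpoint and header format are OpenAI’s REST API as of early 2023, with the key created in your account settings and read from an environment variable:

```python
import os

# OpenAI's chat completions endpoint (as of early 2023)
API_URL = "https://api.openai.com/v1/chat/completions"

def build_chat_request(prompt, model="gpt-4"):
    """Build the headers and JSON body for a chat completion call.

    The API key is created in your OpenAI account settings and read
    here from the OPENAI_API_KEY environment variable.
    """
    api_key = os.environ.get("OPENAI_API_KEY", "")
    headers = {
        "Authorization": f"Bearer {api_key}",  # authenticates you as a user
        "Content-Type": "application/json",
    }
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return headers, body

# Sending it is then one line with any HTTP client, e.g.:
# headers, body = build_chat_request("Summarize this quarter's sales data")
# resp = requests.post(API_URL, headers=headers, json=body, timeout=60)
```

The separation of request-building from sending makes the key-handling (the part I forgot to ask about) easy to see and test.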

AI allows us to better match our effort with value and/or enjoyment. I am already using AI to automate or bypass low-value activities and to enhance my performance of high-value activities. Unlike the pre-AI search tools we have used for many tasks, AI-based chat will produce less noise to sort through to get what we want, and it will increasingly be able not just to return information but to get things done (execute tasks). In the March 15th interview with Bryan McCann from You.com on the Practical AI podcast, he explains current and planned capabilities for his company's YouChat AI-enabled product. The net impact I anticipate, having reviewed scenarios in several vertical industries, is that both employees and customers will be better prepared and focused. This will lead to higher-quality interactions and a consumer expectation that will penalize companies whose employees do not make good use of AI assistance.

The Known Unknown?

According to Donald Rumsfeld, known unknowns are the things we know we do not know.

We know there will be enhanced security and oversight needs with AI, and there will likely be nefarious ways people try to exploit these systems, but we don’t yet have a full threat model based on actual experience. I believe the existential threat from a generalized intelligence, the dystopian doomsday AI, is very unlikely, particularly in the next few years. We need to focus on combating the near-term known unknowns: we know the cyber-security and fraud threats in online systems, and we need to apply what we can anticipate to AI.

There is also the threat of harm from hallucinations, when the AI returns an absurd assertion or makes a nonsensical claim. OpenAI has reportedly made progress with new training techniques that reduce their incidence. Hallucinations are the unintentional side of the threat; the exploits above are the intentional side.

Several of the experts I’ve listened to in recent weeks were pragmatic in predicting threats would be similar to cyber-attacks, fraud, and espionage that we have been dealing with in online systems for years. Some mentioned that threats are likely to continue to emanate from the same types of sources- people with bad intent and a disregard for the harm they cause- that we control for with cyber-security today.

This past week hundreds of scientists and entrepreneurs signed a letter calling for a 6-month pause on training or releasing new AI models more powerful than OpenAI’s GPT-4 and DALL-E (OpenAI’s image generation model). It could have to do with wanting time to make sure we have a comprehensive model for threats (defining our known unknowns) and to understand the model training that is most effective at minimizing AI hallucinations.

As minimizing hallucinations is good PR for all AI companies, perhaps the six months can be used to share techniques for this. In many areas such as sharing ideas from research papers, there continues to be collaboration that looks more like a scientific field than a domain that is being competed over by for-profit companies! At some point I would expect to see the protection of trade secrets and non-disclosure become a priority as it is in most commercial technologies, but there seems to be an understanding that AI is still working through an R&D phase where collaboration benefits all. It has made it easier for any interested party (like me) to stay up on papers, open source packages, and even demo releases to keep a pulse on the latest AI developments.

My Career Progression to Data and ML/AI

In 2017 I moved on from my years in telecom OSS/BSS software to focus fully on data science and AI. I took university certification courses, contributed to volunteer projects, and built custom systems independently. It may seem ironic that I spent much of the last several years building my Python expertise and now I use a code generator! But learning Python allowed me to train models and understand the details of NLP, machine learning, and AI techniques. I downloaded code from papers and from open-source libraries to understand the concepts involved. Even if I never write another line of Python manually, there was value in my work.

I began my career learning to code on personal computers and went on to develop systems for mobile phones and the Internet. Those were the first three disruptive technologies with which my career intersected. When I shifted my focus in 2017, I anticipated that data science and AI would be even bigger than the past disruptive technologies on which I had worked. I’ve been anticipating the boom in AI for six years, so I may not be the most objective evaluator of the opportunities versus threats.

Overviews of LLM AI and Critical Components

In addition to podcasts, I searched AI-related articles on tech sites like Wired and Hacker News, VC news sites like Crunchbase, and data-science sites like KDnuggets. Sites with a startup orientation, like Wellfound, The Muse, and Built In, are also good sources on the latest AI developments and the demand for AI skills at startups.

What are the components of LLM AI, and what is responsible for the recent breakthroughs in performance? What are the business opportunities in the larger AI “ecosystem”? And finally, how is AI likely to change jobs and society over the next couple of years?

Let’s start with a few overviews of LLM AI. The first is an overview of NLP and LLM AI developments by Yossi Motro of tasq.ai. He provides a brief but clear account of the evolution of NLP and of model training, including the Transformer approach that the industry has settled on as current best practice.

The second is an explanation of how GPT-3 and GPT-4 were trained, written by Molly Ruby for Towards Data Science on Medium. She uses a technique that I think we’ll see more of: she prompts GPT to provide answers about how it was trained. It is like asking a more specific version of “tell me how you came to be!”, and GPT is good at explaining this.

As I mentioned earlier, this space has an interesting feel that blends the collaboration on theory seen among academics in scientific fields, the code and data sharing common in open-source software, and the commercial interests of for-profit companies. I’ve spent years in enterprise software development, and there is more knowledge sharing than I have ever seen, but this may change as companies turn from R&D to commercial offerings for which they have ambitious revenue targets.

To stay up on things it helps to be tapped into all three of these channels. I’ve integrated open-source packages from GitHub and PyPI, particularly for NLP and ML projects, and that got me familiar with the training algorithms and the hyper-parameters used to control learning. To understand transformers, attention, and queries, keys, and values (the lower-level building blocks that have allowed these models to understand the context of language), working through research papers, which have lots of math and often include code, is the best way to learn what is going on. The Google Scholar extension shows the number of times a paper has been cited by others. If you review just one paper, the best one is Google’s 2017 transformer paper, “Attention Is All You Need,” as it is the foundation for much of what is happening today.
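To make queries, keys, and values a bit more concrete, here is a minimal NumPy sketch (my own simplification: a single head, no masking or learned projections) of the scaled dot-product attention operation at the heart of that paper:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V: each query attends over all keys
    and returns a weighted mix of the corresponding values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query/key similarity
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # context-weighted values
```

In a real transformer, Q, K, and V come from learned linear projections of the token embeddings, and many such attention heads run in parallel per layer.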

An Interview of General Catalyst’s Deep Nishar on Crunchbase identified four key areas responsible for the progress with LLM AI:

1. The advancement of algorithms and ML techniques. This includes concepts like attention and self-attention, which drive output values in neural nets. It also includes the Transformer architecture, which Google researchers introduced in 2017 and which is integrated into just about every significant AI architecture today. Nishar says that Google and OpenAI have originated about 50% of the AI algorithms in use today.

2. Huge amounts of data are needed to train AI. Advances (such as requiring far less labeled data for training) have allowed more of this to be automated. But building an LLM AI still requires a company that can make huge investments in data storage and processing.

3. Computational power has steadily improved, enabling much of the progress in AI we’ve seen. In particular, the use of fast GPUs has accounted for big gains in the speed of AI.

4. Nishar says that for years there were only a few hundred people across Google, DeepMind, Facebook Research, OpenAI, and Apple who really understood these models well enough to build them. Now there are thousands and include other companies as well as university researchers, so the AI knowledge base is rapidly spreading.

Based on Nishar’s comments, we can see that only a few well-funded companies are capable of leading in the LLM AI space. Besides Google (which now owns DeepMind), OpenAI, Apple, Meta/Facebook, Amazon, Salesforce, and Microsoft, there are also companies with significant venture funding such as Anthropic, Cohere, Adept AI, Inflection AI, and Character.ai.

The Cognitive Revolution podcast hosted by Nathan Labenz and Erik Torenberg

In episode 2 the hosts describe how their experiences motivated them to start this podcast and to name it as they did. They explained that AI is bringing about a cognitive revolution in what each of us is capable of learning and doing. We are in a period when people can earn degrees or new skills they once thought were beyond their capabilities, and information technology and AI have enabled these things.

Nathan explained: “prior to the agricultural revolution, we felt we had to use our muscles to get food and survive. Then we began farming and using tools and beasts of burden. After that we discovered coal then oil which provided us more concentrated energy than our own muscles. Now AI is further leveraging what we can do with external sources of energy, knowledge, and intelligence. AI extends what we thought we could know and accomplish.”

A few weeks ago, I wrote an article on Medium titled “An Inflection Point in our Cognitive Growth,” so I was curious about the overlap with the title of this podcast. My point was that the availability of AI or human experts should not cause us to get lazy about working through an objective analysis of a situation. I also try to overcome bias and other logical fallacies, and I think that mental practice is important both personally and for society; even when it may seem redundant, we sometimes uncover errors or oversights by doing it. I was arguing for the augmentation model, in which we enhance our capabilities through AI but continue to move human intelligence forward!

Practical AI podcast hosted by Chris Benson and Daniel Whitenack

Practical AI, Eye on AI, The Cognitive Revolution, and AI and the Future of Work are my four top podcasts on artificial intelligence. Chris and Daniel are thorough, professional hosts who bring a lot of detail from their professional experience: Chris with big military and government initiatives, and Daniel more with private-sector entrepreneurial work, which is a strong combination. They cover an incredible range of the space, from specific companies to intriguing issues like collaboration (such as their coverage of HuggingFace.com), privacy, geopolitics, and cultural differences in the use of language, along with lesser-known implications of multi-modal AI implementation.

If you want to listen to one episode that is relevant to current LLM AI developments, go to March 15th’s interview with Bryan McCann from You.com. There is a lot of information about integrating front-end user systems with LLM AI and the various considerations.

Morgan Stanley investor conference- Greylock Partners with Umi Mehta

At Morgan Stanley’s annual conference, there was a discussion between Umi Mehta and two executives from Greylock Partners. Mehta is global head of tech private equity and venture capital investing at Morgan Stanley. He said that even today most people don’t really understand the power of search to get answers to what they need to do; often it is simply a matter of adding a few extra clarifying words to a query. Mehta said we should think about generative AI this way: within 2–5 years, every profession will have an AI co-pilot that is anywhere from helpful to essential in performing the job. This is one of the best discussions of the impact of AI on professional jobs that I have heard.

“Regulation is an inferior model to auditing and incenting: guide AI companies toward what we need, but don’t lock them down; that will be too slow and expensive. Let’s list our dystopian fears, figure out how likely they are, and have a dialogue (such as between government and the big AI companies) to ensure incentives and penalties steer things in the right direction. As for the use of individual content: we often contribute content to the public domain for a variety of personal or commercial reasons. There has already been a transaction in which we received, or gambled on receiving, value, such as sales leads or likes or simply access to these systems. The result is that this content became part of the public domain and is used in the training of LLM AIs. Right now we are all receiving some incremental value from this aggregate public-domain content. This is why monetizing the contribution of content to LLM AIs doesn’t fly.”

Mehta: “I envision an AI tutor and an AI doctor on everyone’s mobile phone. If we slow that down with regulation, think of the cost in aggregate human suffering. We need to embrace these developments, consistent with an ongoing dialogue, to ensure the macro model has the right incentives and penalties to prevent bad actors from causing harm with these capabilities.”

“AI will be our personal co-pilot”: many AI apps will simply use the mobile device as the UI. These are apps where a latency in the hundreds of milliseconds is OK. That will not be acceptable for something like driver assist, but for a lot of job-enhancement co-piloting it will be fine.

AI and the Future of Work podcast hosted by Dan Turchin

Augmented Intelligence is how Bob Rogers says we should refer to Artificial Intelligence. Each of us has the opportunity to augment the skills that we bring to every activity in our lives, particularly our jobs. On the podcast AI and the Future of Work, Rogers was interviewed about a range of AI issues.

Rogers believes that historically, information technology has not eliminated many jobs. It is probably more correct to say that it typically augments jobs. Systems are used to do work better and/or faster. This adds to the work that humans must do or oversee, and humans find more work for the systems to perform. As the cycle turns, both humans and computers have more work to do!

The big problem so far has been the tendency of these systems to hallucinate: to make stuff up or argue in favor of something that is clearly nonsensical. Part of overcoming hallucination is the responsibility of the continuous training improvements AI companies are making. Another part is a new skill termed prompt engineering.

Prompt engineering involves understanding the language and parameters that can be sent to an AI as a request. It is emerging as a skill that is critical to reducing hallucinations and returning the highest-quality results.
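As a toy illustration of the idea (my own template, not an official technique), a prompt can be engineered to ground the model in supplied material and give it an explicit way out instead of letting it guess:

```python
def build_grounded_prompt(question, context):
    """Prompt template that fights hallucination two ways: it restricts
    the model to the supplied context, and it gives the model explicit
    permission to admit when the answer is not there."""
    return (
        "Answer the question using ONLY the context below.\n"
        "If the answer is not in the context, reply exactly: I don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Constraining the source material and offering an "I don't know" escape hatch are two of the simplest prompt-engineering moves for improving answer quality.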

A Long Way from being Sentient!

There was a famously bad social media post by a Google engineer last summer in which he claimed that Google’s LaMDA was sentient (conscious in the sense of understanding human feelings and intent). Google suspended him and publicly disputed his claim, but it led the philosopher and consciousness expert David Chalmers to hold a conference on AI sentience!

The Eye on A.I. podcast hosted by Craig S. Smith had a long interview with Yann LeCun (NYU professor, Meta Chief Scientist, and originator of pioneering ML and AI techniques) on contrastive self-supervised machine learning, joint embedding predictive architecture (JEPA), and LLMs. LLM models lack objectives to guide generative tasks and do not understand the reality that language is used to describe.

Most human knowledge has nothing to do with language; language is a layer on top of a massive base of knowledge that we often call common sense. Babies acquire knowledge of basic physics by observation before they have any ability to interact with language. With transformer-based approaches, input gets tokenized, and much gets lost in the translation. Convolutional nets don’t do this, so some combination of the two may be needed to allow AI to learn about the reality of our world from observation.

Setting goals (aka objectives) will almost certainly require an AI to have a base of common-sense knowledge likely acquired through observation. LLM AI can’t get there, at least not without rethinking some core pieces about how they are trained. LeCun makes a compelling argument for the understanding of objectives that would be possible with some alternative learning techniques that are being explored. LeCun is a fascinating speaker with knowledge of this field that goes back to being one of the early innovators, so I highly recommend listening to this episode!

Supplementing LLM AI with live/current, creative, and/or novel data

GPT-3 was built by consuming a massive amount of public data from the web (the Common Crawl archive), with a training cutoff in 2021. Any LLM has knowledge only up to the cutoff date of whatever data archive it was trained on. That means it will not be able to answer questions about anything more current, like the latest products or fashion trends.

There is also a degree of homogenization in what an LLM AI produces. It has an outstanding command of current, very professional language usage, but it can lack variety and trends toward the most referenced or most common phrasing.

Perhaps the biggest business opportunity is for companies to supplement LLM AI with additional models or data that are more recent, more specific to an industry or domain, or richer in diverse examples and creativity. For example, if you want the latest fashion or products to appear in results, the LLM AI alone cannot help: it contains no data more recent than the cutoff date of the archive used to train it.
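A sketch of what that supplementing can look like (retrieval-style prompting with a deliberately naive keyword ranker; a real system would use embeddings and a vector index):

```python
def answer_with_fresh_data(question, recent_docs, llm_call, top_n=3):
    """Supplement a static LLM with post-cutoff data: rank recent documents
    by keyword overlap with the question, then prepend the best matches to
    the prompt so the model can use information it was never trained on."""
    q_words = set(question.lower().split())

    def overlap(doc):
        return len(q_words & set(doc.lower().split()))

    top = sorted(recent_docs, key=overlap, reverse=True)[:top_n]
    prompt = (
        "Use the recent information below to answer the question.\n\n"
        "Recent information:\n" + "\n".join(top) +
        f"\n\nQuestion: {question}"
    )
    return llm_call(prompt)  # llm_call wraps whatever chat API you use
```

The design point is that the base model never needs retraining; freshness lives entirely in the data pipeline that feeds the prompt.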

Rethinking How Human Talent is Evaluated

Instead of thinking in terms of AI eliminating jobs, I think the much more common scenario will be job augmentation and enhancement via an AI co-pilot. I began thinking about how this will change the job market, starting with how job listings are written. Often the prerequisites listed in job postings today are outdated, since they can be learned just-in-time, and arcane and/or clerical skills are hardly a value-add differentiator of talent!

Some of the tasks that have defined a job until today may be anachronistic and can be bypassed by prompting or direct task completion via AI. For example, I have expert skills with many office tools, but much of my knowledge, even of advanced commands, is trivial relative to the value I bring through other attributes. Companies that can abstract up a layer to identify true differentiators of value will gain a competitive advantage in the market for talent.

The introduction of AI is an opportunity for companies to re-assess their value chain, particularly human contributions to business value. Less emphasis on job prerequisites that have become trivial allows a business to focus on the true differentiators of value in candidates. There are a few problems with lists of prerequisites in job listings like “X years with YY tool”:

1. It assumes the next candidate should do the job as it has been done, which is now probably an inefficient or obsolete method.

2. Some job prerequisites become trivial with AI assistance, as they can be learned just-in-time or bypassed through AI task execution or re-engineered workflows.

3. They imply that employee value is differentiated by the arcane and clerical, neglecting higher-value attributes (like aptitude, values, motivation, logical reasoning, and communication skills, which will be vital for prompting AI systems).

Companies that rethink candidate and employee value with the presence of AI job augmentation will attract the most valuable candidates. They will also reduce the incidence of false negatives caused by mis-prioritized job requirements.

Energy Efficiency

We are still relatively ignorant about how our minds work and how computers can learn most efficiently. That ignorance translates into a high cost to get machines to think: the cost in electricity and hardware resources to train the latest LLM AIs from a handful of companies is enormous. As innovative as researchers have been, we will likely climb a learning curve that leads to far more efficient learning several years from now.

Human brains consume an outsized amount of oxygen and calories for conscious thought, and computer ‘thinking’ is similarly expensive. I assume our algorithms will become more efficient at learning, and combined with Moore’s Law this suggests the energy cost to train these systems will decrease. I am making a few logical leaps in comparing brains and computers, as they are quite different, but it makes sense that training these models will become more energy-efficient. Experts have found that as they revise algorithms, they often not only produce better results but also become more efficient.

Is the Public Ready for AI?

In my article “Fear in the Evening of the Platform Lifecycle,” I reviewed the warning signs of user platforms going into the twilight of a cycle, and how there seemed to be a dearth of innovation from big tech. Then, bam: AI suddenly got real in the minds of the general public. Even more than technical breakthroughs, public excitement is the best development that could drive a new growth phase at Google and others, because it makes demand-pull much more likely. As I noted in that article, the subsidized, advertising-based model of these platforms has led to increasing churn and rising user complaints, as could be predicted. Public excitement that creates demand-pull is a better proposition for new growth and might be a way to move past current user dissatisfaction and shift to a new paradigm.

After all the interviews and articles I have taken in, I am not sure the proposed 6-month hiatus on AI releases is needed. There are enormous benefits and productivity gains that companies can begin to accrue now, and there are benefits to running with public momentum rather than trying to put the brakes on a force that can create jobs and new revenue. I have personally been able to increase my productivity with AI while barely investing in the knowledge to do so. I would guess that a 50% productivity increase for a team using AI-based job-assistance tools is a reasonable target within the next year.


Written by Brian G Herbert

Award-winning Product Manager & Solution Architect for new concepts and ventures. MBA, BA-Psychology, Certificates in Machine Learning & BigData Analytics
