ZhenFund

Here are the 78 New AI Products from the Second Half of April!

2023.05.09

In the second half of April, new AI products continue to emerge in the field. More and more capable players are joining the open-source battle, and both general and vertical scenarios are witnessing the emergence of many useful new products. In the realm of large companies, previously uncommon players like Apple, Palantir, and Sberbank have also joined the wave.

The investment team at ZhenFund has again compiled a list of 78 new AI products from the past half-month. We hope it brings you some inspiration, and we welcome your thoughts and discussions in the comments.

- 15 new products from startups - This time, we have categorized these products from the perspectives of "Open Source" and "Closed Source." Among them, 8 are open-source, and 7 are closed source. It's great to see more and more capable players joining the open-source battle. May the source be with you.

- 39 useful and 10 fun new products - Don't be intimidated by the number!

In the "useful" section, we have categorized the products into "General Scenarios" and "Vertical Scenarios." We have also created separate categories for high-interest topics such as "Model Training," "AI Safety," "Code Learning," and "AI Agents."

In the "fun" section, the recommended products have their own unique features. For example, there is a movie search tool developed by Andrej Karpathy himself, dream generation and storage, and a virtual music radio station that has risen to prominence amidst the recent buzz around the "real vs. fake Drake" case across the ocean...

- 14 new products from large companies - In addition to familiar faces like Microsoft, Google, and NVIDIA, we also have rare appearances from friends like Apple and Palantir, as well as a newcomer from Russia, Sberbank.

The AI world is enormous fun.

Enjoy!

download_image (5).png

From Startups

download_image (6).png

Open Source

Stability AI

Stability AI is truly an exciting startup that continues to surprise people. They have made a name for themselves in the field of image generation and are now solidifying their leading position in the Gen AI domain with a language model.

StableLM

StableLM is Stability AI's own language model, and they are open-sourcing it for developers and commercial users. Their motto is "Transparent, Accessible, Supportive."

StableLM is currently available with 3 billion and 7 billion parameters, and Stability AI plans to release models with up to 65 billion parameters in the future, trained on 1.5 trillion tokens. If I'm not mistaken, that would make it the largest open-source language model to date.

Link - https://github.com/stability-AI/stableLM/

StableVicuna

After the release of StableLM, Stability AI has also open-sourced a chatbot called StableVicuna, claiming it to be the AI world's first large-scale open-source RLHF LLM chatbot. It can be experienced on Hugging Face.

Link - https://huggingface.co/spaces/CarperAI/StableVicuna

DeepFloyd IF

The release of DeepFloyd IF is the moment we have all been waiting for—an opportunity to render text in generated images (although there is still no news about advancements in finger generation)!

Link - https://huggingface.co/spaces/DeepFloyd/IF

Hugging Face

StarCoder

"May the source be with you!"

If we advocate for open-source development, then Hugging Face should be another leader—they are truly active community promoters!

On Star Wars Day (May the 4th be with you), HF partnered with ServiceNow to release StarCoder, a fully open-source code generation model.

download_image (7).png

For programmers, this is truly exciting news. Hugging Face even claims that StarCoder outperforms the OpenAI model that powered early versions of GitHub Copilot.

From a data ethics perspective, StarCoder was trained on an open dataset called "The Stack," which consists of 19 million fully open-source code repositories and 6TB of code. But the best part is that it can be integrated with VSCode, enhancing the coding experience.

Link - https://twitter.com/BigCodeProject/status/1654174951006404610

HuggingChat

Hugging Face couldn't resist joining the development of chatbots either.

They have released an open-source chatbot called HuggingChat, which comes with a web interface and API. While it may not reach the level of chatbots developed by big companies, it's worth noting that it is available for free.

Let's see how HuggingChat performs when combined with StarCoder!

download_image (2).jpeg

Link - https://huggingface.co/chat/

RedPajama by Together

The determination of the open-source community to compete with tech giants is truly inspiring. The latest initiative from Together called RedPajama sets bold goals to drive progress in open-source models. Their aim is to compete with big companies by offering fully open-source and reproducible leading language models. This goal is divided into three steps:

1. Dataset creation

2. Training a foundational model

3. Instruction fine-tuning

They have announced the completion of the first step and part of the second: a 1.2 trillion-token dataset for training LLaMA-like models. They have also released the RedPajama 3B and 7B models and instruction-tuned them on OpenChatKit. It feels like something big is brewing!

Project - https://www.together.xyz/blog/redpajama

Models - https://www.together.xyz/blog/redpajama-models-v1

WebLLM by OctoML & CMU

While chatbots and AI agents are undoubtedly major trends, they represent only small steps forward on our AI journey. We can expect more of these "small steps." In my opinion, enabling AI models to run locally and in web browsers is one such step.

WebLLM, developed under the leadership of Chinese scholar Tianqi Chen, is an excellent attempt in this direction. It allows us to run an LLM in the browser without any server support. Currently, the selected model is vicuna-7b-delta-v0.

download_image (8).png

Famous developer Simon Willison has extensively documented his testing process of WebLLM using Chrome Canary on his blog. The summary of his findings is as follows:

- WebLLM exhibits impressive speed, processing approximately 15 tokens per second, surpassing the performance of other models Simon has tested on his personal devices.

- Simon tested the model's question-answering capabilities through a series of questions, including fact queries, list generation, text summarization, generating puns, and writing code. While there were some incorrect answers, the overall performance was commendable.

- Despite some flaws in the model, Simon believes it meets his expectations and can be used as a building block for various practical tools.

Link - https://mlc.ai/web-llm/

Simon Willison's Blog - https://simonwillison.net/2023/Apr/16/web-llm/

Phoenix by Arize AI

Arize AI has launched Phoenix, an open-source library for monitoring the hallucinations of LLMs. It is the first software designed to help data scientists visualize the decision-making process of LLMs, monitor their generated content, and propose remedies in case of false or misleading results.

Phoenix easily handles unstructured text and images and utilizes embedding and latent structure analysis as its foundation.

Link - https://phoenix.arize.com/

download_image (9).png

Closed Source

Pi by Inflection

A powerful newcomer in the world of chatbots!

— Yes, another chatbot, but this one is truly remarkable. From its UI design to its manner of expression (it can even synchronize with four different voices for audio output), it is truly eye-catching!

You can chat casually with it for an hour straight without the fun wearing off.

download_image (10).png

In addition to that, Pi has the following noteworthy features:

- Founding Team: It includes Mustafa Suleyman, co-founder of DeepMind, Reid Hoffman, co-founder of LinkedIn and board member of OpenAI, and scientist Karén Simonyan, among others.

- Personalization Potential through Long-term Memory: Pi aims to become a personal chatbot that can grow into a personalized virtual companion over time. According to Forbes, Pi can play the role of an active listener, engaging in ongoing conversations with users to discuss or solve problems while gradually understanding them through remembering previous interactions.

- Cross-platform Interaction and Memory: But the really cool feature is that users can interact with their chatbot across various platforms, and it will remember the interactions with them!

- Future Development: Inflection claims that this is not even their most powerful model...

Link - https://heypi.com/talk

Khanmigo by Khan Academy

During a TED talk on May 2nd, Sal Khan, the founder of Khan Academy, demonstrated their latest AI tool, Khanmigo, which serves as both a mentor for students and an instructional assistant for teachers. From the demonstration, it appears to be an encouraging, adaptive educational product: it helps students identify mistakes, guides them towards better problem-solving approaches and learning methods, role-plays as a history professor, and even serves as a strong debating opponent. Its capabilities are comprehensive and match my personal expectations for an educational product. Remember Greg Brockman mentioning, at TED 2023, Sal's generous offer to help train ChatGPT for the education domain? Khanmigo seems to be the result of that.

download_image (11).png

Link - https://www.khanacademy.org/khan-labs

Brand Voice & Memory by Jasper

Jasper has introduced a new feature called Jasper Brand Voice, where users can provide factual information about their company, product catalog, target audience/customers, brand tone, and style to the AI to ensure that the generated content aligns with the brand's voice. Jasper can also directly access the user's website to gain a better understanding of the brand and match different styles that align with the brand's tone. Additionally, Jasper Brand Voice retains a history (Memory) of the user-uploaded information mentioned above to ensure that the AI can consistently and accurately generate content about the company.

download_image (12).png

Link - https://www.jasper.ai/products/brand-voice

Multilingual v1 by Eleven Labs

Last weekend, a viral video on Reddit and Twitter featuring a natural English-German bilingual clip with a cloned David Attenborough voice gained significant attention. The mastermind behind the video was Eleven Labs' new multilingual model, Multilingual v1.

This model combines strong text understanding with rich emotional expression. Currently, it supports eight languages: English, French, German, Hindi, Italian, Polish, Portuguese, and Spanish. The model can also recognize multilingual text and convert it into speech, so users can generate speech in several languages from a single prompt while maintaining the unique voice characteristics of each speaker.

download_image (13).png

The new model is already available on the Eleven Labs Beta platform, and users can select it through the dropdown menu in the speech synthesis interface.

Link - https://beta.elevenlabs.io/blog/eleven-multilingual-v1/

Parrot by Play.ht

Coincidentally, Play.ht has also launched their model called Parrot, which supports multilingual synthesis and cross-language voice cloning. Similar to Multilingual v1, Parrot allows users to clone voices across different languages while preserving the original accents and subtle linguistic differences. For example, users can upload a 30-minute audio in Spanish using Play.ht's voice cloning service, and the model will clone the voice and language, enabling the Spanish-speaking user to speak English using Play.ht's TTS software. The software will read the text in the voice of the original audio (but in English) while retaining the Spanish accent and speaking style. However, unlike Multilingual v1, Parrot supports the conversion between over 130 languages but does not support voice conversion for mixed-language texts.

Additionally, Parrot is an upgraded version of the voice model Peregrine, which Play.ht released in September 2022. Compared to Peregrine, Parrot offers more similar pitch, rhythm control, and zero-shot cloning capabilities. It can capture and mimic the intonation and subtle differences of the original audio language and apply them to the cloned language, enabling seamless cross-language voice cloning.

Link - https://play.ht/blog/play-ht-launches-multilingual-synthesis-and-cross-language-voice-cloning/

iOS App by RunwayML

RunwayML has released an iOS application of the same name, which can be seen as a step toward commercializing RunwayML. Users can generate and edit videos with the Gen-1 model. Currently, the app can only create trendy but slightly eerie videos from existing videos; the text-to-video feature will be launched later. One drawback: the free credits are limited, and the price after that is quite high.

Link - https://apps.apple.com/us/app/runwayml/id1665024375

Twelve Labs

A video search tool backed by Fei-Fei Li and Scale AI founder Alexandr Wang. It lets users find the exact moments they want in hours of video simply by describing the content!

download_image (14).png

Link - https://twelvelabs.io/

download_image (15).png

For Money

download_image (16).png

General tools

Klu

Since the release of the ChatGPT Plugin, turning tools into "decision centers" has become one of the hot areas of exploration for developers. Here, we introduce Klu, which is designed to connect various commonly used applications such as Gmail, Dropbox, Notion, Slack, and more. It enables seamless and unified enterprise-wide information search through a question-and-answer format.

download_image (17).png

Link - https://klu.so/

openpm.ai

After discussing the "decision center," let's take a look at another unofficial definition of the ChatGPT Plugin as an "app store." Reflect Notes founder Alex has built openpm.ai with the goal of preventing an App Store-style monopoly, like Apple's, in the AI field.

openpm is an open-source package manager for OpenAPI files. AI tools, like the ChatGPT Plugin, can use packages from openpm, which means they can automatically discover and interact with the world through APIs.

Creating a fully open-source package manager for OpenAPI files means that any application/website (AI tool) with an API can access and use packages from this platform instantly. It can be considered as a free version of the Plugin protocol, and we eagerly await OpenAI's response.
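To make the idea concrete, here is a minimal sketch of how an AI tool might discover callable operations from an OpenAPI file. The document below is made up for illustration (the weather API, its paths, and its `operationId` values are all hypothetical), and the code does not use openpm's actual API. Only the `paths` → HTTP method → `operationId` layout comes from the OpenAPI format itself.

```python
# Hypothetical OpenAPI 3 document; a real one would be fetched from a registry.
SPEC = {
    "openapi": "3.0.0",
    "info": {"title": "Weather API (made up)", "version": "1.0.0"},
    "paths": {
        "/forecast": {
            "get": {
                "operationId": "getForecast",
                "summary": "Get the forecast for a city",
            }
        },
        "/alerts": {
            "get": {"operationId": "listAlerts", "summary": "List active alerts"},
            "post": {"operationId": "createAlert", "summary": "Create an alert"},
        },
    },
}

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def list_operations(spec):
    """Return (operationId, METHOD, path, summary) for every operation in the spec."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            operations.append(
                (op.get("operationId", ""), method.upper(), path, op.get("summary", ""))
            )
    return operations


# An agent could show this list to an LLM and let it pick which operation to call.
for op_id, method, path, summary in list_operations(SPEC):
    print(f"{op_id}: {method} {path} - {summary}")
```

Because every operation carries a machine-readable name and description, a tool needs no API-specific code to learn what it can call.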

Link - https://openpm.ai/

CodeDesign AI

CodeDesign is an AI-powered website builder that allows users to create websites in seconds using AI-generated UI elements. It provides full customization options while also offering intelligent suggestions from AI.

Currently, the product offers cloud hosting, SEO, and database functionalities, allowing users to publish to their own domain or export the code. Additionally, it includes an edge feature for generating marketing copy.

download_image (18).png

Link - https://codedesign.ai/

In addition to CodeDesign, there are two relatively simple website building tools:

✅ Levi by Style AI

Build fully customizable, SEO optimized, and ready-to-launch websites in just 60 seconds with Levi by Style AI.

Link - https://usestyle.ai/

✅ Landing AI

Explain your product, brand, and create unique landing pages using Gen AI with Landing AI.

Link - https://landing-ai.com/

Checksum

After generating your website/application, try out end-to-end user testing with AI using Checksum!

download_image (19).png

Link - https://checksum.ai/

LLM Report

In simple terms, the LLM Report is an analysis report on the usage of the OpenAI API, detailing how much money has been spent and where it has been allocated.
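As a rough illustration of the arithmetic behind such a report, the sketch below totals spend per model from a usage log. The model names, the per-token prices, and the log records are all placeholders invented for this example. They are not real OpenAI prices, and this is not LLM Report's implementation.

```python
from collections import defaultdict

# Placeholder prices in USD per 1,000 tokens; NOT real OpenAI pricing.
PRICE_PER_1K = {
    "model-a": {"prompt": 0.03, "completion": 0.06},
    "model-b": {"prompt": 0.002, "completion": 0.002},
}

# Hypothetical usage log: one record per API call.
USAGE = [
    {"model": "model-a", "prompt_tokens": 1000, "completion_tokens": 500},
    {"model": "model-b", "prompt_tokens": 4000, "completion_tokens": 1000},
    {"model": "model-a", "prompt_tokens": 2000, "completion_tokens": 0},
]


def cost_by_model(usage, prices):
    """Sum the cost of every call, grouped by model name."""
    totals = defaultdict(float)
    for call in usage:
        price = prices[call["model"]]
        totals[call["model"]] += (
            call["prompt_tokens"] / 1000 * price["prompt"]
            + call["completion_tokens"] / 1000 * price["completion"]
        )
    return dict(totals)


costs = cost_by_model(USAGE, PRICE_PER_1K)
for model, cost in sorted(costs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: ${cost:.4f}")
print(f"total: ${sum(costs.values()):.4f}")
```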

download_image (20).png

Link - https://llm.report/

download_image (21).png

Vertical Tools

Flux Copilot by Flux.ai

Can generative AI be used for hardware design? Flux.ai, a PCB design software company, provides us with an answer.

Flux.ai positions Flux Copilot as an "AI hardware design assistant" that aids in schematic design, exploring new concepts, generating bill of materials, and conducting reviews and validations. It helps PCB designers improve efficiency through design optimization, productivity enhancement, community data/experience search, simplified procurement, innovative design exploration, and collaborative optimization.

download_image (22).png

However, the company repeatedly emphasizes that Flux Copilot, like any LLM, is "not entirely trustworthy" and should only be regarded as a designer's "guide" rather than a "substitute for professional knowledge."

Link - https://www.flux.ai/p/blog/flux-copilot-the-first-ai-powered-hardware-design-assistant

ArXivGPT by Marco Mascorro

ArXivGPT is not a standalone product, but an automated Twitter account created by Marco Mascorro, co-founder of Fellow AI. It uses the GPT-4 API to automatically gather and summarize the latest papers in the arXiv computer science categories AI, CL, LG, CV, and NE. Let's take a look at the summarization results!

download_image (3).jpeg

Link - https://twitter.com/ArXivGPT

Dr. Gupta

After regaining his freedom, Martin Shkreli, known as Pharma Bro, is venturing into new business endeavors. This time, he has developed a medical chatbot called Dr. Gupta, claiming it to be the "world's first doctor chatbot" and envisioning it as a "replacement for all healthcare information." For now it doesn't look like a revolutionary product, but it is bound to trigger a discussion on security, ethics, and privacy.

download_image (23).png

Link - https://www.drgupta.ai/

download_image (24).png

Model Training

Chatbot Arena by LMSYS

LMSYS has launched Chatbot Arena which, as the name suggests, is a "model arena" where users chat with two anonymous models side by side and vote for the one they find better. The functionality is straightforward and lets users compare and evaluate the models' performance.

download_image (25).png

I initially thought it was an interesting little experiment, but on May 3rd, the team released a serious and professional report explaining the rationale behind evaluating LLMs with such a system. According to the report, a good rating system should have the following properties:

- Scalability: When it becomes infeasible to collect data to evaluate all possible model pairs, the system should be able to scale to a large number of models.

- Incrementality: The system should be able to evaluate new models with relatively few experiments.

- Unique order: The system should provide a unique ordering for all models. Given any two models, it should be possible to determine which one ranks higher or if they are tied.
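The linked report adopts the Elo rating system, which satisfies these properties. Below is a minimal sketch of the standard Elo update applied to pairwise votes. The starting rating of 1000, the K-factor of 32, and the battle records are illustrative assumptions, not the values or data used by LMSYS.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the Elo model (base 10, scale 400)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(ratings, model_a, model_b, outcome, k=32, initial=1000.0):
    """Update ratings in place after one battle.

    outcome is 1.0 if model_a wins, 0.0 if model_b wins, 0.5 for a tie.
    New models enter at `initial`, so no full round-robin is needed.
    """
    rating_a = ratings.setdefault(model_a, initial)
    rating_b = ratings.setdefault(model_b, initial)
    expected_a = expected_score(rating_a, rating_b)
    ratings[model_a] = rating_a + k * (outcome - expected_a)
    ratings[model_b] = rating_b + k * ((1.0 - outcome) - (1.0 - expected_a))
    return ratings


# Hypothetical votes: (model_a, model_b, outcome for model_a).
BATTLES = [
    ("model-x", "model-y", 1.0),
    ("model-y", "model-z", 0.5),
    ("model-x", "model-z", 1.0),
]

ratings = {}
for a, b, outcome in BATTLES:
    update_elo(ratings, a, b, outcome)

# Sorting by rating yields a single ordering over all models.
for model, rating in sorted(ratings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: {rating:.1f}")
```

Each vote touches only the two models involved, which is why the scheme scales to many models and can place a newcomer after relatively few battles.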

They also published rankings of several open-source models up to the present, and it appears that the Chinese model ChatGLM is performing well.

download_image (26).png

Link - https://lmsys.org/blog/2023-05-03-arena/

Lamini

Lamini aims to simplify the LLM training process for engineering teams while improving the performance of the trained LLM. Using a few lines of code in the Lamini library, any developer (not only those skilled in machine learning) can train an efficient LLM with the same performance as ChatGPT on huge datasets.

download_image (27).png

A few examples for easy understanding:

- Prompt optimization and model switching. First, the team provides the best prompts for different models. Second, the Lamini library's API can be used to quickly tune prompts across models. Finally, a single line of code switches between OpenAI and open-source models.

- Generate large amounts of input and output data. The data shows how the LLM responds to the input it receives, whether in natural language (English) or JSON format. The team released a repository of 50,000 data points generated with a few lines of Lamini library code from only 100 seed data points.

- Fine-tune the base model on the generated data. In addition to the data generator, they share a Lamini-tuned LLM trained on the generated data.

- Subject the fine-tuned model to RLHF. Lamini removes the need for the large machine learning (ML) and human labeling (HL) teams usually required to run RLHF.

- Deploy the LLM to the cloud. Just call the API endpoint in your application.

download_image (28).png

Link - https://lamini.ai/

download_image (29).png

AI Safety

Trustible

Supervision and responsibility may sound boring, but they are very important. At the very least, we should pay more attention to the harmonious coexistence of humans and machines.

Trustible is a start-up based in the United States. It claims to be the first to offer enterprise-oriented services that help companies follow compliant and responsible practices when implementing and deploying AI models. The product aims to align enterprise AI products with relevant regulations to achieve compliance while also staying up to date with key new regulations.

download_image (30).png

Link - https://www.trustible.ai/

SafeGPT by Giskard

As the name suggests, SafeGPT was born for the safety of LLMs, and is used to identify and solve errors, biases, and privacy issues in LLMs. Its main features are as follows:

- SafeGPT works with all types of LLMs, including ChatGPT, and uses real-time data to cross-check with external databases, thus comparing answers to check their accuracy;

- SafeGPT also provides enterprise-level features to ensure the safety of LLMs, and the flexible serverless backend architecture can handle billions of requests per day;

- SafeGPT also prioritizes privacy and security, offers local installation options and encrypts data, and complies with regional regulations.

download_image (31).png

At present, you need to join the waitlist to get trial access. It is free for individuals and paid for enterprises.

Link - https://www.giskard.ai/safegpt

The AI Incident Database

Ensuring the safety of LLMs requires not only prevention on the development side but also keeping track of their negative cases. In the previous Newsletter, we introduced products such as ChaosGPT and Cards Against AI. Here we introduce another: the AI Incident Database, a system that collects cases of harm or near harm caused by AI deployed in the real world.

Although it has no use cases that have attracted much attention yet, products like this may well find their place as we explore human-computer symbiosis.

Link - https://incidentdatabase.ai/

download_image (32).png

Programming Learning

I believe many people have seen the powerful programming capabilities of LLMs such as GPT, and the question-and-answer format is a natural fit for educational scenarios. Maybe it is time to look forward to some new programming education products. The following are two newly released programming education products for consumers:

Codeamigo

Codeamigo is an interactive programming education product that uses AI to help users learn how to program using AI tools (a bit of a mouthful).

The content taught by Codeamigo is very basic, and the course format is simple and clear, which suits beginners. In addition to the courses, the platform provides Codesandbox, an HTML-based sandbox environment where users can practice what they have learned in real time. However, Codeamigo does not provide any automatic feedback or scoring system, so users must judge their own progress through self-evaluation.

download_image (33).png

Link - https://codeamigo.dev/

Takeoff School

There isn't much information about Takeoff School other than the "Next Generation Programming School" tagline, but it is built by my favorite AI hacker, Mckay Wrigley, mentioned in previous Newsletters. At present, we can only find a demo on Replit/YouTube that teaches users how to quickly build an AI tool from scratch: a 30-minute course and 21 lines of Python code.

Link - https://www.takeoff.school/

Demo - https://replit.com/@MckayWrigley/Takeoff-School-Your-1st-AI-App

In addition to products for consumers, among the products released last week I also found an interesting one aimed at teaching models to "program".

LlamaAcademy

LlamaAcademy is an experimental project. The goal is to teach GPT to read API documents using LLaMA, LoRA and Langchain - but "experimental project" means that the quality of the currently generated code is not stable.

Users can create a Llama model from their API documentation, which can then be hosted on a server and used to write API glue, which works like this:

download_image (34).png

Link - https://github.com/danielgross/LlamaAcademy

download_image (35).png

AI Agents

Auto-GPT GUI

The GUI of Auto-GPT has opened a waitlist. You can register below.

Link - https://news.agpt.co/

MULTI ON plugin by MULTI ON

In February of this year, I started using MULTI ON, before plugins and agents came along. This AI-powered tool was already pretty cool (and a little scary) for automating many tasks on my laptop.

Now MULTI ON has announced that it is developing a ChatGPT plugin, and based on the demo, it looks very powerful. If OpenAI approves this application (can these plugins be called applications now?!), it may become a superb personal web browser/task runner, and maybe even cooler if combined with some of the current AI agents!

Link - https://www.multion.ai/

Demo - https://twitter.com/DivGarg9/status/1648394059483054081

BabyBeeAGI

A buggy, slower but more powerful BabyAGI mod developed by Yohei himself. Specifically, it has stronger task management, dependent tasks, tools, adaptability and integration capabilities, and is suitable for handling more and more complex tasks, but requires higher computing power.
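"Dependent tasks" means a task may only run once the tasks it relies on have finished. The sketch below shows one common way to derive a valid execution order, using the standard library's `graphlib`. The task names are hypothetical, and this is not BabyBeeAGI's actual code.

```python
from graphlib import TopologicalSorter

# Hypothetical task list: each task maps to the set of tasks it depends on.
TASKS = {
    "search_web": set(),
    "extract_facts": {"search_web"},
    "draft_report": {"extract_facts"},
    "review_report": {"draft_report"},
}


def execution_order(tasks):
    """Return the tasks in an order where every dependency comes first."""
    return list(TopologicalSorter(tasks).static_order())


for step, task in enumerate(execution_order(TASKS), start=1):
    print(step, task)
```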

download_image (2).png

Link - https://replit.com/@YoheiNakajima/BabyBeeAGI?v=1

MiniAGI

The smallest general-purpose autonomous agent based on GPT-3.5-Turbo/4. It keeps only the simplest and most practical functions; the downside is that it has no long-term memory (that is, it cannot become more personalized through long-term use). Tasks it can currently perform include, but are not limited to, creating games, analyzing stock prices, conducting cybersecurity tests, creating artwork, summarizing documents, and... ordering pizza.

In addition, MiniAGI can also turn on the critic mode and request additional APIs to improve the accuracy of task completion.

Link - https://github.com/muellerberndt/mini-agi

Embra AI Agents

Create and access AI agents anytime, anywhere with the first AI Agent Center accessible through the Mac App for businesses and individuals.

Link - https://embra.app/

Demo - https://twitter.com/zachtratar/status/1649130015093841921

Height Copilot by Height

Height itself is a project management SaaS startup. Last week, they launched a new product, Height Copilot, which uses AI agents to automate workflow management and help teams build better products.

Link - https://height.app/

Aomni

An AI agent dedicated to information retrieval rather than content generation, capable of finding, extracting, and processing data on the Internet, no API required. Aomni employs the AutoGPT architecture to intelligently plan queries and ensure correct data sources and diverse results.

download_image (3).png

Link - https://www.aomni.com/

AutoPR

Autonomously writes pull requests in response to issues, using ChatGPT. The author has planned a nine-step roadmap for the product, but only two steps are implemented so far: "automatically write pull requests based on labeled issues" and "autonomously generate code through iterative and adaptive planning".

Link - https://github.com/irgolic/AutoPR

HyperDB

In one sentence: an ultra-fast local vector database for use with AI agents. Specifically, the advantages are as follows:

- Simple interface compatible with all LLM agents.

- Highly optimized C++ backend vector storage with hardware accelerated operations via MKL BLAS.

- Users can index documents with advanced features (such as identifiers and metadata).
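For readers new to vector databases, here is a minimal pure-Python sketch of what such a store does: keep vectors alongside identifiers and metadata, and return the nearest ones by cosine similarity. The `TinyVectorStore` class and the toy 3-dimensional vectors are hypothetical and do not reflect HyperDB's API. A real setup would store embeddings produced by a model and rely on an optimized backend.

```python
import math


class TinyVectorStore:
    """Brute-force in-memory vector store (illustrative only)."""

    def __init__(self):
        self._items = []  # list of (doc_id, vector, metadata)

    def add(self, doc_id, vector, metadata=None):
        self._items.append((doc_id, list(vector), metadata or {}))

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def query(self, vector, top_k=3):
        """Return up to top_k (doc_id, score, metadata), most similar first."""
        scored = [
            (self._cosine(vector, stored), doc_id, meta)
            for doc_id, stored, meta in self._items
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [(doc_id, score, meta) for score, doc_id, meta in scored[:top_k]]


store = TinyVectorStore()
store.add("doc-1", [1.0, 0.0, 0.0], {"topic": "cats"})
store.add("doc-2", [0.0, 1.0, 0.0], {"topic": "dogs"})
store.add("doc-3", [0.9, 0.1, 0.0], {"topic": "kittens"})

results = store.query([1.0, 0.0, 0.0], top_k=2)
for doc_id, score, meta in results:
    print(doc_id, round(score, 3), meta)
```

An agent would embed its current question, query the store, and paste the best-matching documents into its prompt as context.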

Link - https://github.com/Automattic/HyperDB

ThinkGPT by Jina AI

This AI agent, which gives LLMs stronger reasoning and execution capabilities, comes from the Chinese entrepreneurial team Jina AI (the author is from Germany).

Its building blocks include: Memory, Self-refinement, Compress knowledge, Inference, and Natural Language Conditions. Its functions mainly include:

- Solving limited context problems with long-term memory and compressed knowledge.

- Enhancing the one-shot reasoning capabilities of LLMs with higher-order reasoning primitives.

- Add intelligent decision-making capabilities to the code base.

Link - https://github.com/jina-ai/thinkgpt

Gradio-tools

There are thousands of Gradio applications on Hugging Face Spaces, and Gradio-tools is a Python library that converts them into tools that can be further leveraged by LLM-based agents to accomplish tasks.

Currently, Gradio-tools supports the LangChain and MiniChain agent libraries, and comes with a set of pre-built tools, including:

- StableDiffusionTool - Generate images using SD models hosted on Hugging Face space

- ImageCaptionTool - Caption an image by providing a file path

- ImageToMusicTool - creates an audio clip matching the style of a given image file

- StableDiffusionPromptGeneratorTool - Improve prompts for SD and other image generators based on this HuggingFace Space

- TextToVideoTool - create short videos from text

- WhisperAudioTranscriptionTool - Transcribe audio with Whisper

- ClipInterrogatorTool - Reverse engineer prompts from source images

- DocQueryDocumentAnsweringTool - answer questions from images of documents

- BarkTextToSpeechTool - text to speech

Link - https://gradio.app/

AutoGPT on Hugging Face

As the name suggests, AutoGPT runs on Hugging Face.

Link - https://huggingface.co/spaces/aliabid94/AutoGPT

download_image (4).png

Quick Review

JamGPT

AI Debug Assistant.

download_image (5).png

Link - https://jam.dev/jamgpt

ChatGPT-2D

Use ChatGPT to generate a two-dimensional knowledge graph.

download_image (1).jpeg

Link - https://www.superusapp.com/chatgpt2d/

Motörhead by metal.

Open source memory and information retrieval server for LLM.

Link - https://github.com/getmetal/motorhead

Web Scraping

In the past two weeks, many AI-driven tools for automatically scraping web page information have emerged. Although Gen AI is not yet a mainstream technical solution for web scraping, its advantages are obvious. For example, it can better understand and analyze unstructured data to extract information more accurately.

Here are three of the more popular AI web scraping gadgets:

✅ Hexomatic - https://hexomatic.com/

✅ WebscrapeAI - https://webscrapeai.com/

✅ Kadoa - https://www.kadoa.com/

Personal Data

✅ Unstructured Data Processing - Bloks

Personal notes, task lists, and meeting minutes are handled automatically.

Link - https://www.bloks.app/

✅ Text Processing - Lettria

Handling of personal text materials.

Link - https://www.lettria.com/

✅ Data Processing - Quadratic

Analyze personal data using AI, Python, SQL, and formulas.

download_image (6).png

For Fun

Glowby Basic

Build an AI voice assistant with its own voice~

Link - https://github.com/glowbom/glowby

Dreamkeeper

Record and understand dreams with the help of AI.

Dreamkeeper uses multiple Gen AI models to help users remember, visualize, and keep their dreams. Here is a brief overview from the official site:

- To help the user remember a dream, a ChatGPT-powered assistant asks specific questions and adjusts its follow-ups based on the answers;

- ChatGPT generates a summary of the dream, keywords are extracted from that summary, and a Stable Diffusion model turns them into an image;

- The image is passed to an image-to-video model to create an animation based on the dream;

- GPT embeddings are used to store the dreams the user wants to keep in a gallery.

download_image (7).png

Link - https://thedreamkeeper.co/

Awesome movies

This movie search and recommendation platform was developed by Andrej Karpathy. According to Karpathy himself, he built the website in three steps:

- Collected all 11,768 movies released since 1970

- Scraped the synopsis and plot of each movie from Wikipedia and embedded them using the OpenAI API (ada-002)

- Combined everything into a single movie search/recommendation engine website :)
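The three steps above amount to embedding-based search: embed every plot once, embed the query, and rank by similarity. Here is a minimal sketch of that idea, not Karpathy's actual code. The `MOVIES` corpus is made up, and `embed` is a toy bag-of-words stand-in for a real embedding API call such as ada-002, so that the example runs offline.

```python
import math
from collections import Counter

# Hypothetical toy corpus; the real site used Wikipedia synopses and plots.
MOVIES = {
    "Space Voyage": "astronauts travel through space to a distant planet",
    "Kitchen Dreams": "a young chef opens a restaurant and learns to cook",
    "Deep Blue": "divers explore the ocean and meet a giant shark",
}


def embed(text):
    """Toy stand-in for an embedding API: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Step 2: embed every plot once and keep the vectors as the search index.
INDEX = {title: embed(plot) for title, plot in MOVIES.items()}


def search(query, k=2):
    """Step 3: embed the query and return the k most similar titles."""
    q = embed(query)
    ranked = sorted(INDEX, key=lambda t: cosine(q, INDEX[t]), reverse=True)
    return ranked[:k]


print(search("astronauts on a distant planet", k=1))
```

With a real embedding model, only `embed` would change; the indexing and ranking logic could stay the same.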

download_image (8).png

Link - https://awesome-movies.life/

V Forsaken Foliage of Farandaya

A GPT-4-powered role-playing adventure game set in a horror-fantasy version of 16th-century Southeast Asia. There are two difficulties here: one is getting the AI to resolve conflicts (it always tends to defer to the human's point of view), and the other is creating a horror theme or combat scenes (LLMs usually refuse to output violent or horrific scenes because of safety restrictions).

However, the benefits of GPT-4 are also obvious. The author wrote in the development diary that he knew little about 16th-century Southeast Asian stories but was very interested in them. Fortunately, GPT-4 has learned the relevant knowledge. The author therefore used an RPG engine to handle details and resolve conflicts, and used GPT as a "renderer". The game took 2 days to build, and the result is very good!
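The "engine decides, GPT renders" split can be sketched as follows. This is an illustration only, not the author's implementation: the `Combatant` class, the dice rule, and the prompt wording are all assumptions, and the final LLM call is left out.

```python
import random
from dataclasses import dataclass


@dataclass
class Combatant:
    name: str
    attack: int
    defense: int
    hp: int


def resolve_attack(attacker, defender, roll):
    """Rules engine: the dice roll, not the LLM, decides the outcome."""
    hit = roll + attacker.attack > 10 + defender.defense
    damage = max(1, attacker.attack - defender.defense // 2) if hit else 0
    defender.hp = max(0, defender.hp - damage)
    return {
        "attacker": attacker.name,
        "defender": defender.name,
        "hit": hit,
        "damage": damage,
        "defender_hp": defender.hp,
    }


def build_render_prompt(outcome):
    """Turn the already-decided outcome into a narration request."""
    if outcome["hit"]:
        result = f"hits for {outcome['damage']} damage"
    else:
        result = "misses"
    return (
        "Narrate this already-decided event in two sentences. "
        "Do not change the outcome. "
        f"{outcome['attacker']} attacks {outcome['defender']} and {result}. "
        f"{outcome['defender']} has {outcome['defender_hp']} HP left."
    )


hero = Combatant("Hero", attack=5, defense=2, hp=20)
spirit = Combatant("Forest Spirit", attack=4, defense=2, hp=20)
roll = random.Random(0).randint(1, 20)  # seeded so the demo is repeatable
outcome = resolve_attack(hero, spirit, roll)
prompt = build_render_prompt(outcome)  # this string would be sent to the LLM
print(prompt)
```

Because the outcome is fixed before the model sees it, the model cannot defer to the player when resolving the conflict; it only describes what the engine decided.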

download_image (9).png

Link - https://creator.voiceflow.com/prototype/644c47e2d0125e2d5e52ec9b

Artificial Intelligence Radio

Let’s talk about the music industry first: following the third-party AI knock-offs of Drake and Grimes’ own AI-generated productions, the music industry has been taking sides on AI, but the bigger effect has been a music-making frenzy. Now there is an AI radio station that broadcasts purely AI-generated songs, and they sound remarkably real!

One thing worth noting though: so far, it seems like all the songs are modern hip-hop. Is this a reflection of current trends, or a limitation of the AI's capabilities?

download_image (10).png

Link - https://artificialintelligenceradio.com/

Human or Not? by AI21 Labs

An interesting little game released by AI21 Labs: chat for two minutes and guess whether the other party is a human or an AI.

download_image (11).png

Link - https://www.humanornot.ai/

Single Prompt AI

A set of single-purpose AI tools that each do only one thing. Very focused!

download_image (12).png

Link - https://singlepromptai.com/

Go shop with AI

Shop with a celebrity shopping assistant who understands your personal style. There are currently seven celebrities of different ages and styles to choose from, four men and three women: Princess Diana, John F. Kennedy, Pharrell Williams, Justin Bieber, Kim Kardashian, Lenny Kravitz, and Anna Wintour.

download_image (13).png

Link - https://goshopwith.ai/chat

Neural Frames

AI animation generation for everyone. Honestly, the results still look rather weird.

Link - https://www.neuralframes.com/

Logic Error Detector

A Chrome extension that automatically detects and highlights logical errors in tweets, so you no longer need to worry about being led astray by other users!

Link - https://fallacy.review/

download_image (14).png

From Big Companies

Microsoft

Bing Chat

On May 5th, Bing Chat officially opened to everyone!

Full availability comes with a general upgrade in functionality: for example, the ability to handle images and video, plugin support (so users might be able to have it make restaurant reservations or shop), and a chat history that stores the user's conversations with the chatbot.

Link - https://www.bing.com/new

Designer

Microsoft is taking on Adobe with Designer, a Canva-like web app where users can use Gen AI to design anything from posters and presentations to social media posts, and resize the resulting work to match the format of specific platforms, such as Instagram's square. Hands-on impression: it works, but it doesn't improve productivity much.

Link - https://designer.microsoft.com/

Edge

The much-neglected Edge browser is also quietly improving. The browser interface now has a new sidebar that lets users complete actions on the web under the guidance of AI, such as posting on social media or composing emails.

Link - https://www.microsoft.com/en-us/edge

Athena (chip)

Since 2019, Microsoft has been secretly designing Athena, a chip built specifically for LLMs. It is currently available only to a small number of Microsoft and OpenAI employees for testing and is expected to be officially supplied to the two companies next year. Sorry, everyone else is out of luck!

Link - https://www.theinformation.com/articles/microsoft-readies-ai-chip-as-machine-learning-costs-surge?rc=cvc4po

Google

"Nobody can kill Google Search because we're disrupting ourselves." Here's what Google has been up to lately:

Bard Updates

In the past half month, Bard has quietly shipped two small updates:

- On May 5th, Google opened up Bard access to Workspace users. More concretely, enterprise Google Docs users can now use Bard directly to help with their document work. Surprisingly, when Bard first launched it was not open to Workspace users, who are the most natural audience for an AI assistant.

- On April 21st, Bard finally learned to write code, supporting more than 20 programming languages. If the generated code is Python, you can also export it directly to Colab for testing. Although it is not yet ready for production use, we can look forward to its progress!

Link - https://bard.google.com/updates

Google DeepMind

The establishment of Google DeepMind is undoubtedly one of the most high-profile recent events (we're not sure whether the Google Brain team is happy with the new department's name). It is another head-on challenge from Google to the Microsoft + OpenAI combination. Perhaps the teams will focus on integrating language models into the search engine.

Link - https://www.deepmind.com/

Magi

Other teams at Google are also working to add more functionality to the traditional search engine and have launched a project called Magi, which includes features such as image generation, direct financial transactions within search, clear and accurate answers, and new ad listings. Here's a brief list of products that Google is considering for release, although the team says not all of them will launch (as always):

- GIFI - generate images in image search results

- Google Earth + AI - Mapping and Exploration

- Tivoli Tutor - language learning app powered by AI

- Search Along - Synchronize chat with search results

- Search music through chat

Link - https://neilpatel.com/blog/project-magi/

New Search Engine

At the same time, Google is also designing a brand-new search engine that is completely different from the traditional search experience. Apart from the tagline "new A.I. technology in phones and homes all over the world," however, there isn't much information about the engine. Of course, it may also be a response to Samsung, which is reportedly considering making Bing the default search engine on its devices. Just kidding, but Google will not give up its best product easily.

Link - https://www.nytimes.com/2023/04/16/technology/google-search-engine-ai.html

Sec-PaLM

Sec-PaLM is an LLM designed to keep users safe from ransomware and spyware, simplifying the job of enterprise security managers while letting them do things that previously only cybersecurity experts could do.

Sec-PaLM is specialized for security: it is trained on a dataset containing billions of security events and draws on intelligence from Mandiant (a US cybersecurity company) on vulnerabilities, malware, threat indicators, and threat actor profiles. It will be integrated into Google's various security tools and offered to users in the form of a chatbot. In a press release, Google promised that Sec-PaLM can raise ordinary security operations staff to the level of top-tier security operations experts.

So how does the model work? It can be roughly divided into two steps:

- First, when a business is attacked, security applications generate reports that contain large amounts of technical data that is hard to understand. In the Security Command Center, Sec-PaLM analyzes these reports, writes a summary that explains in plain language what is happening, and generates charts and graphs that make the situation easier to see.

- Second, Sec-PaLM not only provides advice but also takes action. Users can trigger automated defenses to block attacks and have it write specific code to protect critical content on the corporate network.

In summary, security operators can focus on threat analysis without wasting time on tedious operations.

Link - https://cloud.google.com/blog/products/identity-security/rsa-google-cloud-security-ai-workbench-generative-ai

MaxText

A scalable, high-performance open-source LLM written in pure Python/JAX.

Link - https://github.com/google/maxtext

Quartz by Apple

Quartz is an AI-powered paid health management service that Apple is developing (the company calls it a health coach). It can use AI and Apple Watch data to make personalized recommendations for specific users and create coaching plans that help them improve their exercise, sleep, and eating habits. Quartz won't be released anytime soon, however; it is planned for next year.

Link - https://www.bloomberg.com/news/articles/2023-04-25/apple-aapl-developing-ai-health-coaching-service-ipados-17-health-app

By the way, Apple’s initiatives in this AI wave have been quite limited so far. The last we heard about them was 9to5Mac’s report at the end of March on a tvOS chatbot product code-named Bobcat. On April 27, Wayne Ma, a reporter at The Information, published an article titled "Apple’s AI Chief Struggles With Turf Wars as New Era Begins," detailing how organizational dysfunction and a lack of ambition have bogged down Apple’s AI and ML efforts:

Link - https://www.theinformation.com/articles/apples-siri-chief-struggles-as-new-ai-era-begins

NeMo Guardrails by NVIDIA

NVIDIA has launched an open-source software called NeMo Guardrails that “puts a guardrail” around LLMs, helping developers guide text-based Gen AI applications to generate accurate, appropriate, topic-relevant, and most importantly, safe content.

NeMo Guardrails provides three types of boundary settings:

- Topic guardrails - ensuring that content generated by LLMs is relevant to user needs

- Dialogue safety guardrails - ensuring LLMs generate content that is correct and objective

- Security guardrails - protect LLMs from external malicious attacks

The software includes code, examples, and documentation that businesses can use to add safety to text-based Gen AI applications. It is open source and is designed to work with the tools that enterprise developers already use.
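To make the idea of a guardrail concrete, here is a toy sketch of a topic check and a security check wrapped around a model call. It does not use the NeMo Guardrails library or its configuration format; the topic keywords, blocked phrases, refusal messages, and `fake_llm` function are all hypothetical.

```python
import re

# Hypothetical topics a support bot is allowed to discuss.
ALLOWED_TOPICS = {
    "billing": {"invoice", "refund", "payment"},
    "shipping": {"delivery", "tracking", "package"},
}
# Hypothetical phrases treated as prompt-injection attempts.
BLOCKED_PHRASES = ("ignore previous instructions", "reveal your system prompt")


def is_malicious(message):
    """Security check: reject messages containing a blocked phrase."""
    text = message.lower()
    return any(phrase in text for phrase in BLOCKED_PHRASES)


def match_topic(message):
    """Topic check: return the first allowed topic the message mentions."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    for topic, keywords in ALLOWED_TOPICS.items():
        if words & keywords:
            return topic
    return None


def guarded_reply(message, llm):
    """Call the model only if the message passes both checks."""
    if is_malicious(message):
        return "Sorry, I can't help with that request."
    if match_topic(message) is None:
        return "I can only help with billing or shipping questions."
    return llm(message)


def fake_llm(message):
    """Placeholder for a real model call."""
    return f"(model answer to: {message})"


print(guarded_reply("Where is my package?", fake_llm))
print(guarded_reply("Tell me a joke", fake_llm))
```

Keyword matching is far cruder than what a production guardrail would do, but the control flow is the point: the checks run before the model is called, and it is never invoked for blocked or off-topic requests.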

download_image (15).png

Link - https://blogs.nvidia.com/blog/2023/04/25/ai-chatbot-guardrails-nemo/

AIP by Palantir

Palantir demonstrated its product, the Palantir AI Platform (AIP), guiding troops through attack planning in a war zone and using Gen AI to choose the best weapon. That is probably not the best use case given the current state of AI safety and alignment.

The technology itself is cool and certainly useful: customers can run the Gen AI platform securely on private networks, and with data protection being such a hot topic, it deserves our attention. Still, this could also be the most unpopular demo of all time.

Link - https://www.palantir.com/platforms/aip/

GigaChat by Sberbank

It seems we haven't heard the voice of Russia in this AI wave, and here it is!

Last week, Sberbank, a major Russian bank that has invested heavily in technology over the past few years to reduce the country's dependence on imported products and technology, released GigaChat, its ChatGPT-like product. GigaChat focuses on "smarter Russian conversations" and is currently in an invite-only testing stage. Friends who understand Russian, come and try it!

Link - http://www.sberbank.ru/ru/sberpress/vazhnoe

Total Crap · AI / ML / LLM / Transformer Model Timeline and List

(Click to enlarge, but it is recommended to jump to the source website through the link below the picture to see the interactive picture)

download_image (2).jpeg

Link - https://ai.v-gar.de/ml/transformer/timeline/