- Flannery O'Connor
- me
Artificial neural networks (ANNs) are sometimes described by analogy to biological neurons, but their mode of operation bears little resemblance to that of biological brains. It's impossible to deny the impressive success of models like GPT-3 and GPT-4 (the models behind ChatGPT) or DALL-E 2. Unfortunately, training these models requires very large computing infrastructure that is available to only a few institutions. ChatGPT was trained on Microsoft's Azure cloud. OpenAI has not disclosed how long it took to train GPT-3, which makes outside estimates difficult². However, Microsoft has built supercomputers for AI training and says that its latest supercomputer contains 10,000 graphics cards and over 285,000 processor cores². Training ChatGPT involves a combination of data preparation, model design, and machine learning algorithms, and can take from several days to several weeks depending on the size and quality of the training data and the computing resources available¹.
The data requirements for training something like GPT-3 are equally impressive. According to the blogs cited below, GPT-3 was trained on datasets of text from the internet totaling 570 GB and about 300 billion words. By contrast, the adult human brain weighs on average about 1.5 kg (3.3 lb)². Brain weight differs between men and women, averaging about 1,370 g in men and 1,200 g in women². The brain is about 60% fat¹. For the average adult in a resting state, the brain consumes about 20 percent of the body's energy⁴. The brain's primary function, processing and transmitting information through electrical signals, is very expensive in terms of energy use⁴. An adult brain uses approximately 110 calories per pound, per day¹. For an individual human that's not much, but multiplied across 8 billion people, keeping the world's brains running consumes a lot of calories.
Comparing the human brain to something like ChatGPT is a category mistake: large language models are nothing like brains, and they are not even close in terms of functionality.
Hyperdimensional computing (HDC) is an emerging learning paradigm that computes with high-dimensional binary vectors. It combines very high-dimensional vector spaces (e.g., 10,000 dimensions) with a set of carefully designed operators to perform symbolic computations with large numerical vectors². HDC is attractive because of its energy efficiency and low latency, especially on emerging hardware¹. HDC is based on the observation that key aspects of human memory, perception, and cognition can be explained by the mathematical properties of hyperdimensional spaces comprising high-dimensional binary vectors known as hypervectors³.
I got the previous information from Bing's AI chat bot.
HDC is also called Vector-Symbolic Architecture (VSA).
In HDC, high-dimensional vectors can be used to represent words or concepts. Using simple operations, such as add, multiply, XOR, sign, and majority, these vectors can be combined to represent new concepts. The vectors are typically binary (0, 1) or bipolar (-1, 1) and can be stored and manipulated efficiently.
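The code in the rest of this post uses the bipolar flavor. Just to illustrate the binary flavor mentioned above, here is a minimal sketch of XOR binding and majority-vote bundling; the function names are my own, not from any particular library.

import numpy as np

def binary_hdv(n: int = 10_000) -> np.ndarray:
    # a random binary hypervector with entries in {0, 1}
    return np.random.randint(0, 2, size = n)

def bind_xor(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    # binding for binary hypervectors: elementwise XOR
    return np.bitwise_xor(x, y)

def bundle_majority(*vs: np.ndarray) -> np.ndarray:
    # bundling for binary hypervectors: elementwise majority vote
    return (np.sum(vs, axis = 0) > len(vs) / 2).astype(int)

x, y, z = binary_hdv(), binary_hdv(), binary_hdv()
b = bundle_majority(x, y, z)

# the bundle stays similar to each of its inputs
print(np.mean(b == x))                              # roughly 0.75

# XOR binding is its own inverse, so binding twice recovers the original vector
print(np.all(bind_xor(bind_xor(x, y), y) == x))     # True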
In what follows, I won't try to be efficient either in terms of memory or computing. I'll try to develop a few simple HDC operations and a couple of sample applications.
Hyperdimensional Vectors
import numpy as np

def hdv(N: int = 10_000) -> np.ndarray:
    # a random bipolar hypervector with entries drawn from {-1, 1}
    return np.random.choice([-1, 1], size = N)

x = hdv()

In [3]: x
Out[3]: array([ 1,  1,  1, ..., -1,  1, -1])
High Dimensions Can Be Surprising
x = hdv()
y = hdv()

sum(x == y)
Out[8]: 4906
def cos(x: np.ndarray, y: np.ndarray) -> np.float64:
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
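A quick sanity check with this function (the exact number varies from run to run): a random hypervector is perfectly similar to itself, while two independent random hypervectors are nearly orthogonal.

x = hdv()
y = hdv()

cos(x, x)    # 1.0 (up to floating point)
cos(x, y)    # close to 0 at 10,000 dimensions, typically on the order of 0.01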
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

from hdc import hdv, cos
# compute a similarity matrix for a sample of random hypervectors
def calc_hdv_distances(num_dimensions: int, num_samples: int = 1000) -> np.ndarray:
    hdvs = np.zeros(num_samples, dtype = object)
    for i in range(num_samples):
        hdvs[i] = hdv(N = num_dimensions)
    dist = np.zeros((num_samples, num_samples))
    np.fill_diagonal(dist, 1)    # each vector is perfectly similar to itself
    for i in range(num_samples - 1):
        for j in range(i + 1, num_samples):
            dist[i, j] = cos(hdvs[i], hdvs[j])
            dist[j, i] = dist[i, j]
    return dist

def main():
    dist10 = calc_hdv_distances(10)          # 10-dimensional vectors
    dist10_000 = calc_hdv_distances(10_000)  # 10,000-dimensional vectors

    # plot the distribution of pairwise cosine similarities
    df = pd.DataFrame({"dim: 10": np.ndarray.flatten(dist10),
                       "dim: 10_000": np.ndarray.flatten(dist10_000)})
    sns.histplot(data = df, bins = 30)
    plt.xlabel("cos")
    plt.show()

    sns.heatmap(dist10)
    plt.title("dim: 10")
    plt.show()

    sns.heatmap(dist10_000)
    plt.title("dim: 10_000")
    plt.show()

if __name__ == "__main__":
    main()
Operations on Hyperdimensional Vectors
def bundle(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    # bundling (superposition): the result stays similar to both inputs
    return np.sign(x + y)

def bind(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    # binding: the result is dissimilar to both inputs but preserves similarity structure
    return x * y

def shift(x: np.ndarray, k: int = 1) -> np.ndarray:
    # permutation: a circular shift of the components
    return np.roll(x, k)
x = hdv()
y = hdv()
z = hdv()

# bind distributes over bundle
all(bind(x, bundle(y, z)) == bundle(bind(x, y), bind(x, z)))
Out[12]: True

# bind preserves similarity
cos(x, y) == cos(bind(x, z), bind(y, z))
Out[15]: True

# shift preserves similarity
cos(x, y) == cos(shift(x), shift(y))
Out[16]: True
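One more identity is worth checking, because the dollar-of-Mexico example in the next section relies on it: for bipolar vectors, binding a vector with itself gives the all-ones vector, so bind acts as its own inverse.

w = hdv()
v = hdv()

all(bind(w, w) == np.ones_like(w))    # True: w * w is 1 everywhere
all(bind(w, bind(w, v)) == v)         # True: binding with w twice recovers v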
What is the Dollar of Mexico?
U = hdv()    # USA
M = hdv()    # Mexico
D = hdv()    # dollar
P = hdv()    # peso
X = hdv()    # country
Y = hdv()    # currency

# records for the US and Mexico, combining the
# individual country and currency pairs
A = bundle(bind(X, U), bind(Y, D))    # US
B = bundle(bind(X, M), bind(Y, P))    # Mexico
# bind(D, A)
#   = bundle(bind(D, bind(X, U)), bind(D, bind(Y, D)))   (bind distributes over bundle)
#   = bundle(bind(D, bind(X, U)), Y)                     (D binds with itself to give all 1s)
#   ~ Y                                                  (the remaining term is just noise)
dollar_role_us = bind(D, A)
print(cos(dollar_role_us, Y))    # the dollar fills the currency role in the US record
0.7107038764492565
# bind(dollar_role_us, B) ~ P
dollar_of_mexico = bind(dollar_role_us, B)
print(cos(dollar_of_mexico, P))    # the peso is the "dollar of Mexico"
0.5000999900019995

print(cos(dollar_of_mexico, D))    # the dollar itself is not Mexico's currency
-0.014197160851716099
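The query result is noisy (a cosine of about 0.5 rather than 1.0), so a practical system cleans it up by comparing it against an item memory of the known atomic hypervectors and returning the closest one. Here is a minimal sketch of that clean-up step; the item_memory dict and cleanup function are my own illustration, not part of the code above.

item_memory = {"USA": U, "Mexico": M, "dollar": D, "peso": P,
               "country": X, "currency": Y}

def cleanup(v: np.ndarray, memory: dict[str, np.ndarray]) -> str:
    # return the name of the stored hypervector most similar to v
    return max(memory, key = lambda name: cos(memory[name], v))

print(cleanup(dollar_of_mexico, item_memory))    # expect "peso"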
Protein Classification
amino_acids = ['A', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'Y']
def trimer_hdv(amino_acids: list[str], N: int = 10_000) -> dict[str, np.ndarray]:
    # a random hypervector for each possible trimer (21**3 = 9,261 of them)
    trimer_hdvs = dict()
    for aa1 in amino_acids:
        for aa2 in amino_acids:
            for aa3 in amino_acids:
                trimer_hdvs[aa1 + aa2 + aa3] = hdv(N = N)
    return trimer_hdvs

trimer_hdvs = trimer_hdv(amino_acids)
from Bio.SeqRecord import SeqRecord

def embed_sequences(seqs: list[SeqRecord],
                    trimer_hdvs: dict[str, np.ndarray],
                    N: int = 10_000) -> tuple[np.ndarray, np.ndarray]:
    hdvs = np.zeros((len(seqs), N))
    seq_types = []
    for idx, seq in enumerate(seqs):
        # bundle the hypervectors of every overlapping trimer in the sequence
        for pos in range(len(seq) - 2):
            # str() so the Seq slice matches the string keys of trimer_hdvs
            hdvs[idx, :] += trimer_hdvs[str(seqs[idx].seq[pos:(pos + 3)])]
        hdvs[idx, :] = np.sign(hdvs[idx, :])

        # label each sequence from its record id
        if seqs[idx].id.find("HUMAN") != -1:
            seq_types.append("HUMAN")
        else:
            seq_types.append("YEAST")
    return hdvs, np.array(seq_types, dtype = str)
hdvs, seq_types = embed_sequences(seq_recs, trimer_hdvs) # seq_recs is a list of BioPython SeqRecords
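The post doesn't show how seq_recs is loaded. A minimal sketch, assuming the sequences live in a single FASTA file (the filename is hypothetical) whose record ids contain HUMAN or YEAST, as Swiss-Prot entry names do:

from Bio import SeqIO
from Bio.SeqRecord import SeqRecord

# hypothetical FASTA file with a mix of human and yeast protein sequences
seq_recs: list[SeqRecord] = list(SeqIO.parse("human_yeast_proteins.fasta", "fasta"))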
def get_training_test_index(length: int, training_pct: float = 0.8) -> tuple[list, list]:
    idx = np.random.choice(length, size = int(np.rint(training_pct * length)), replace = False)
    mask = np.full(length, True, dtype = bool)
    mask[idx] = False
    ids = np.array(list(range(length)), dtype = int)
    return list(ids[~mask]), list(ids[mask])

training_idx, test_idx = get_training_test_index(hdvs.shape[0])
def prototype(hdvs: np.ndarray, idx: list) -> np.ndarray:
    # class prototype: bundle (sum and sign) the training hypervectors of one class
    return np.sign(np.sum(hdvs[idx], axis = 0))

human_training_idx = [i for i in training_idx if seq_types[i] == "HUMAN"]
human_prototype = prototype(hdvs, human_training_idx)

yeast_training_idx = [i for i in training_idx if seq_types[i] == "YEAST"]
yeast_prototype = prototype(hdvs, yeast_training_idx)
def prediction(human_prototype: np.ndarray, yeast_prototype: np.ndarray, x: np.ndarray) -> str:
    return "HUMAN" if cos(human_prototype, x) > cos(yeast_prototype, x) else "YEAST"

predictions = np.array([prediction(human_prototype, yeast_prototype, x) for x in hdvs[test_idx]],
                       dtype = str)
print(np.mean(predictions == seq_types[test_idx]))
$ time python src/proteins_hdc.py
0.845

real    0m7.568s
user    0m7.319s
sys     0m0.270s
====================================================
Source: Conversation with Bing, 4/27/2023
(1) An Introduction to Hyperdimensional Computing for Robotics. https://link.springer.com/article/10.1007/s13218-019-00623-z.
(2) [2202.04805] Understanding Hyperdimensional Computing for Parallel .... https://arxiv.org/abs/2202.04805.
(3) In-memory hyperdimensional computing | Nature Electronics. https://www.nature.com/articles/s41928-020-0410-3.
(4) Fulfilling Brain-inspired Hyperdimensional Computing with In ... - IBM. https://www.ibm.com/blogs/research/2020/06/in-memory-hyperdimensional-computing/.
=====================================================
Source: Conversation with Bing, 4/27/2023
(1) How Much Energy Does the Brain Use? - BrainFacts. https://www.brainfacts.org/Brain-Anatomy-and-Function/Anatomy/2019/How-Much-Energy-Does-the-Brain-Use-020119.
(2) How many calories does the brain consume? | Calories - Sharecare. https://www.sharecare.com/health/calories/brain-calories-at-rest.
(3) Power of a Human Brain - The Physics Factbook - hypertextbook. https://hypertextbook.com/facts/2001/JacquelineLing.shtml.
(4) Does Thinking Really Hard Burn More Calories? - Scientific American. https://www.scientificamerican.com/article/thinking-hard-calories/.
(5) We finally know why the brain uses so much energy. https://www.livescience.com/why-does-the-brain-use-so-much-energy.
=====================================================
Source: Conversation with Bing, 4/27/2023
(1) Brain size - Wikipedia. https://en.wikipedia.org/wiki/Brain_size.
(2) Brain Anatomy and How the Brain Works | Johns Hopkins Medicine. https://www.hopkinsmedicine.org/health/conditions-and-diseases/anatomy-of-the-brain.
(3) How Big Is a Human Brain? Brain Size and Brain Weight - Verywell Mind. https://www.verywellmind.com/how-big-is-the-brain-2794888.
(4) Human brain - Wikipedia. https://en.wikipedia.org/wiki/Human_brain.
=====================================================
Source: Conversation with Bing, 4/27/2023
(1) Training ChatGPT AI Required 185,000 Gallons of Water: Study. https://gizmodo.com/chatgpt-ai-water-185000-gallons-training-nuclear-1850324249.
(2) What are GPT-3 and ChatGPT by OpenAI? How do they work?. https://blog.illacloud.com/what-are-gpt-3-and-chatgpt-by-openai-how-do-they-work/.
(3) Introducing ChatGPT - openai.com. https://openai.com/blog/chatgpt.
(4) How many days did it take to train GPT-3? Is training a neural ... - Reddit. https://www.reddit.com/r/GPT3/comments/p1xf10/how_many_days_did_it_take_to_train_gpt3_is/.
(5) Product - OpenAI. https://openai.com/product.