
Artificial Creative Intelligence: Breaking the Imitation Barrier

Rowland Chen, Roger B. Dannenberg, Bhiksha Raj, and Rita Singh
School of Computer Science and Language Technologies Institute
Carnegie Mellon University
Pittsburgh, Pennsylvania 15213 USA

Abstract

Not all knowledge is created equal. A hierarchical architecture is a method to classify knowledge for use in the fields of human cognition and computational creativity. This paper introduces an Insight-Knowledge Object (IKO) model as a framework for Artificial Creative Intelligence (ACI), a step forward in the pursuit of replicating general human intelligence with computing machinery. To achieve ACI, it is hypothesized that a fundamental rethinking of the architecture of human cognition and knowledge processing is required. One possible novel architecture could be the IKO model. The authors include a description of ongoing work at Carnegie Mellon University that applies the IKO model in practice with an artificial music improvisation embodiment.

Introduction

Since the dawn of computing machinery (Jacquard’s first use of wooden punch cards for looms to imitate and weave fabric designs in 1801, from Zimmerman 2017), true machine creativity has remained elusive (IBM 2019; Olenik 2019). A creative idea has been defined as one that is novel, surprising, and valuable (Boden 2004; Abraham 2018; Runco 2012; Stein 1953). Recent advances in artificial intelligence, machine learning in particular, approach the successful imitation of the styles of select painters through style transfer in the visual arts (Zhou et al. 2019). Advances have also been achieved in voice impersonation (Andabi 2017; Gao 2018). However, the authors assert that true creations do not emerge from style transfer. Nothing truly novel and no surprises result from imitation.
Similar to Turing’s question from 1950, “Can machines think?” (Turing 1950), the authors approach their ultimate question, “Can machines create?”

A fundamental rethinking of human and artificial creative processes, along with the corresponding mathematics and computing machinery, is needed to break through an imitation barrier. This paper introduces an Insight-Knowledge Object (IKO) model as a framework for human creativity and Artificial Creative Intelligence (ACI), a step forward in the pursuit of replicating general human intelligence with computing machinery. The IKO model represents a novel approach to computational creativity and computing machine design.

Of importance to consider are the ethical and moral implications of developing a machine that can create new thoughts, ideas, plans of action, conversations, science works, and works of art as would a human. While one clear answer does not exist, embedded in the question facing scientists and engineers, “Should machines create?”, are issues and concerns of ethics and morality.

The authors hope the Insight-Knowledge Object model proves useful to those advancing the science of human intelligence and the research driving towards computational creativity.

Insight-Knowledge Object Model

A proposed Artificial Creative Intelligence framework, the Insight-Knowledge Object (IKO) model (Figure 1), builds on Chen’s prior work (Chen 2009). Proposed here is a way of modeling the human thought process that can be used to shape the development of ACI machines. It is important that each level in the IKO hierarchy be present in the process of human thinking as well as in artificial attempts at reproducing that process. Knowledge objects are on the left-hand side, and insight processes appear on the right. In the IKO model, as insight processes and knowledge objects ladder up through the hierarchy, increasing levels of cognitive sophistication are reached.
The IKO model has eleven levels of knowledge objects and ten levels of insight processes acting upon the knowledge objects. The IKO model begins with a state of knowledge object known as void. In this state, even the acknowledgement of nothing does not exist. At the very top of the IKO model is the knowledge object creation, generated by an inspirational insight process.

Proceedings of the 11th International Conference on Computational Creativity (ICCC’20) ISBN: 978-989-54160-2-8

Figure 1: Insight-Knowledge Object model representing a hierarchical architecture for human cognition.

Descriptions of Each Insight Process

Instinctual insight acts on the void to generate null, the first true layer of knowledge. Null emerges from void as primal instincts create an awareness of one’s environment. This level of knowledge is deemed null because at this level, there is at least consciousness of existence or non-existence. In the void, even consciousness does not exist.

Definitional insight acts on the null to elevate knowledge into data. At this level, the insight process gives definition to objects and actions in the null. Definitional insight labels a collection of unnamed and unidentified things so that distinctions are drawn between them. Each object is now defined and becomes a datum.

Contextual insight acts on data to generate facts in the next layer of the hierarchy. Facts represent a richer and fuller set of knowledge than pure data. For example, if one takes the word coffee as a datum, there is no context for reference. Given some context such as the commodities trading market, coffee takes on the meaning of a traded good. If food service is the context, then coffee takes on the meaning of a beverage. Contextual insight allows distinctions to be made between data to create different facts.

Utilitarian insight acts on facts to generate know-how, how an object is used and for what purpose.
In our coffee example, utilitarian insight emerges to provide the know-how for what to do with coffee. In the commodities market context, know-how would be how to trade coffee on the spot or futures markets. In the food service context, know-how would be how to prepare coffee for consumption. Without utilitarian insight, coffee has no real value. Simply speaking, utilitarian insight provides knowledge of use.

Experiential insight acts on know-how to generate memories. The execution of know-how generates experiences that can be remembered and used in the future. Following the coffee example, experience in making coffee enables a barista to remember how much foam to put on top of a latte.

Reflective insight acts on memories to generate wisdom. Reflection works on a meta-plane of thinking and takes on a new layer of abstraction in the knowledge hierarchy. Insights are not simply generated on single points of execution but on a set of memories. For example, remembering how to make a latte is a memory, but digging deep to understand why people order lattes takes reflection. Wisdom emerges as a person can take a step back to reflect and learn from prior thoughts, decisions, and actions.

Recognitional insights act on wisdom to generate patterns. This insight function is in the realm of data science and data analytics. Recognizing patterns links related or unrelated pieces of wisdom to generate knowledge that would not emerge otherwise. For example, connecting the preparation of a perfect latte to the film “Seven Samurai” (Kurosawa 1954), in which one of the samurai has dedicated his whole life to perfecting his skills as a swordsman, represents connecting two topics that on the surface are completely unrelated. A pattern emerges with the recognition that these humans continually strive for mastery in their respective fields. Recognitional insights create connections that drive thinking further and produce patterns.

Extrapolative insights act on patterns to generate predictions. This insight function is in the realm of statistics and probability. Making predictions links related or unrelated patterns to generate knowledge that would not emerge otherwise. Extrapolative insights produce predictions used for weather forecasting, social media user preference, and the serving up of relevant advertising. Again, in our coffee example, extrapolative insight is required to predict the demand for lattes in a coffee shop during the course of a day.

Comparative insights act on predictions to generate imitations. This insight function is in the realm of machine learning and implementations such as generative adversarial networks (GANs). Predictions are compared against a reference with the goal of achieving an imitation that most closely matches the reference. Much work and many examples of painter style matching, voice impersonation, and literature already exist. This level of the knowledge hierarchy is the present boundary of today’s state-of-the-art deep learning techniques. One could surmise researchers and scientists have hit an imitation barrier. For lattes, skilled baristas can copy fanciful milk designs on the surface of lattes based on prior examples of a master barista’s work. This is latte style transfer.

Inspirational insights act on imitations to generate creations. This insight function has not yet been designed and requires fundamental research across the disciplines of neuroscience, physiology, psychology, and computer science. The proposed ACI framework surpasses the imitation level by using unique, to-be-developed computing machines and computational algorithms, simulating human inspirational insights, to output creations. A master barista thinks up and executes unique and novel designs for lattes (Figure 2).
She is not simply mimicking prior art. The inspiration process of the proposed IKO model breaks through the imitation barrier. At this level of the IKO hierarchy, the challenge of Artificial Creative Intelligence could be solved.

Figure 2: Inspired latte design (Photo credit:

Related Work – the DIKW Pyramid

In information management and business technology, knowledge management was a new approach for data storage and data processing that was introduced late last century. A hierarchical model was developed at the time to capture, store, and retrieve documents (origin unknown, Sharma, 2008). Data, information, knowledge, and wisdom comprise the model, also known as the DIKW pyramid (Zeleny, 2005; Rowley, 2006). In the knowledge management field of information technology, the DIKW pyramid has not evolved since its introduction (Eliot 1934). A collateral benefit of the new IKO model, primarily intended for artificial creative intelligence applications, is a potential new methodology for knowledge management in business information management and information technology.

The IKO Model as Functions

The increasing levels of sophistication in the IKO model can be written as a set of functions in which insight processes act upon knowledge objects (variables) of the form:

Knowledge object = Insight function acting on the next lower order of knowledge object

KO_i = f_i(KO_{i-1})

The increasing levels of sophistication in the IKO model are symbolically presented below, where each function is an insight process. The exact nature of the functions has not yet been discovered. The use of the “f_i(KO)” notation is intended to drive the thinking of those seeking ways to represent creativity and, perhaps, to provide a way to develop the mathematics of creativity.

void = initial condition
null = f_1(void)
datum = f_2(null)
fact = f_3(datum)
know-how = f_4(fact)
memory = f_5(know-how)
wisdom = f_6(memory)
pattern = f_7(wisdom)
prediction = f_8(pattern)
imitation = f_9(prediction)
creation = f_{10}(imitation)

Therefore, creation is a function of imitation, prediction, pattern, wisdom, memory, know-how, fact, datum, null, and void. The authors postulate that the system function, F, relies on ten nested, or composite, functions. If any function is skipped, the system collapses, and output falls short of a creation. F can thus be written as:

F = f_{10} ∘ f_9 ∘ f_8 ∘ f_7 ∘ f_6 ∘ f_5 ∘ f_4 ∘ f_3 ∘ f_2 ∘ f_1

Breaking the Imitation Barrier

“Deep learning and current AI, if you are really honest, has a lot of limitations.”
Jerome Pesenti, Vice President of Artificial Intelligence, Facebook (December 4, 2019)

“Creativity and where we started exploring with the (Disney movie trailer) is fascinating because deep learning isn’t the answer to creativity.”
John Smith, IBM Fellow at IBM Research (2019)

Current attempts at computational imitation rely on machine learning techniques that use algorithms acting on training and testing data in a resource-intensive approach. In the context of the proposed IKO model, generative adversarial networks (GANs) produce imitations (level ten) acting directly on data (level three), thus skipping several levels. But imitations are not creations. The authors hypothesize the existence of an imitation barrier that limits current approaches to level ten of the model (Figure 3). The authors also hypothesize that, as a result of level skipping in the context of the IKO framework, attempts at artificial creativity will not be able to break through an imitation barrier to reach creations.
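The composite system function F and the level-skipping hypothesis can be illustrated with a toy sketch. The paper states that the exact nature of the insight functions is not yet known, so everything below is an assumption made for illustration only: the names `LEVELS`, `make_insight`, `INSIGHTS`, and `compose` are hypothetical, and each stand-in function does nothing more than check that it receives the knowledge object one level below its own and return the label of the next level.

```python
from functools import reduce

# Knowledge-object levels, lowest to highest, as listed in the text.
LEVELS = [
    "void", "null", "datum", "fact", "know-how", "memory",
    "wisdom", "pattern", "prediction", "imitation", "creation",
]


def make_insight(i):
    """Return a stand-in for f_i: it accepts only level i-1 and yields level i."""
    def f(knowledge_object):
        if knowledge_object != LEVELS[i - 1]:
            raise ValueError(
                f"f_{i} expects {LEVELS[i - 1]!r}, got {knowledge_object!r}"
            )
        return LEVELS[i]
    return f


# f_1 .. f_10 (hypothetical placeholders; the real functions are unknown).
INSIGHTS = [make_insight(i) for i in range(1, len(LEVELS))]


def compose(functions):
    """Build F = f_n ∘ ... ∘ f_1 from a list ordered [f_1, ..., f_n]."""
    return lambda x: reduce(lambda acc, f: f(acc), functions, x)


# The full chain reaches the top of the hierarchy.
F = compose(INSIGHTS)
print(F("void"))  # creation

# Level skipping: apply f_9 directly to a datum, as the text describes for
# imitation produced straight from data. In this toy model the chain breaks.
skipping = compose(INSIGHTS[:2] + INSIGHTS[8:])
try:
    skipping("void")
except ValueError as err:
    print("collapsed:", err)
```

The strict input check is a design choice of this sketch, not a claim about how real insight processes behave. It simply encodes the hypothesis that each level depends on the one directly beneath it, so that omitting any function prevents the chain from reaching a creation.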

Figure 3: The Imitation Barrier

Computational creativity occurs in natural brain function, and the proposed IKO hierarchy suggests a model for human thought processes. When a human creates, inspiration does not happen in a vacuum. All knowledge in the brain comes into play. For example, facts, know-how, memories, and wisdom are needed but are left out of current imitative computing. Refer to the system function, F, that is a composite of all insight functions. If any function is skipped, the system collapses.

Figure 4: House of cards collapsing (Photo credit:

In order to validate or refute the level-skipping failure hypothesis, additional research and exploration are required, using an interdisciplinary approach across the fields of mathematics, computer science, computer engineering, neuroscience, physiology, psychology, and perhaps philosophy.

JerryBot: the IKO Model in Practice

In various music genres, creative improvisation on a single instrument or multiple instruments delivers music that is pleasing to the listener. The music genres include rock, jazz, jazz fusion, classical, blues, rhythm & blues, bluegrass, Indian, and others. The improvisation technique of one or more members of a musical group creates new, real-time compositions during a live performance or in a music studio. Music improvisation is an example of human creativity: novel, surprising, and valuable (Boden 2004; Abraham 2018; Runco 2012; Stein 1953). The sounds created are so valuable (pleasing) that a music artist develops a fan base comprised of music lovers around the world. The improvised music is created through inspirational insights based on the intelligence, experience, and technique of the music artist or artists. When a highly skilled and highly loved music artist passes away, his or her new improvisational music creations and corresponding fan experiences disappear forever.
An Artificial Creative Intelligence machine aims to fill the emptiness left behind by a dead artist by recreating the improvisation style and reproducing the instrument sounds of the specific artist. The music fan’s pleasurable listening experience with the late artist and his or her band is extended. The goal of the machine is to perform alongside human musicians in real time, such that the hybrid human-artificial band’s music is indistinguishable from any new music that could have been performed if the artist were not dead. One can call this the creation game, with apologies to Turing (Turing 1950). A version of an artificial music improvisation machine design appears in Figure 5.

In this system, called JerryBot (Chen 2020), multiple living musicians improvise with the ACI machine, similar to a completely human band. Ongoing work at Carnegie Mellon University focuses on a specific musician, the late guitarist Jerry Garcia of the rock group The Grateful Dead. A database containing recordings of all live concert performances of the Grateful Dead has been captured from the Internet Archive (Internet Archive 2019). These data, which the authors call DeadNET (Figure 6), represent nearly 2,500 concerts, performed between 1965 and 1995, and approximately 7,500 hours of music. The data comprise the Historical Variations of Song. For example, within DeadNET, there are over six hundred variations of the song “Playing in the Band.”

In the block diagram, one can see recognitional, extrapolative, comparative, and inspirational insight functions highlighted in yellow. These four insight functions are integrated into a single system in an attempt at developing an ACI machine.
Additionally, embedded in the recordings captured in DeadNET are the band’s musical data, facts, know-how, memories, and wisdom accumulated over 30 years of performance. With DeadNET and the insight functions of recognition, extrapolation, comparison, and inspiration, the ACI machine under development includes all levels of knowledge object and insight function. No levels are skipped. The machine holistically brings together a whole-brain approach to computational creativity.

[Figure 3 labels: Imitation; Prediction; Pattern; Current boundary of AI; Creation; Break through the boundary with Machine Inspiration]

Figure 5: Artificial Creative Intelligence machine: JerryBot design (Chen, 2020)

Figure 6: DeadNET extracted from the Grateful Dead collection of concert recordings on the Internet Archive.

The authors have adopted music as a language paradigm to leverage existing solutions and tools developed in the Language Technologies Institute at Carnegie Mellon University as well as open-source software. In progress are the use of a “bag of words” model and vector quantization of .wav files (Figure 7, Raj et al. 2019). Continuing work is intended to address the other components of the artificial music improvisation machine, for example: the isolation of Jerry Garcia’s lead guitar track from the recordings of the entire band to create JerryNET, perhaps modeled after the neural basis of auditory attention used for speaker isolation (Geravanchizadeh 2020); discovering the neural basis of inspirational insight; modeling the music conversations that occur when a jam band assembles and plays; and formulating how a machine can self-assess the value of computationally created music (pleasantness of the output from a hybrid human-machine band). Much work remains.

Figure 7: Log mel spectrogram from work-in-process (Raj, B.; Agarwal, S.; and Raj, T. 2019)

Other Use Cases

Beyond music, other potential applications for Artificial Creative Intelligence include use cases in language arts (the haiku problem) and autonomous vehicles (control and decision-making in unlearned situations).

The Haiku Problem

5-7-5. That is the syllable count for the Japanese three-line poem style known as haiku. In haiku, the first two lines set the tone of the poem and the emotions of the reader. The third line is a surprising closure. For example, below is a haiku by Edo-period (18th century A.D.) Japanese artist Katsushika Hokusai:

I write, erase, write,
Erase, re-write again, then
A red poppy blooms

The final line of the haiku is a surprise and an emotionally satisfying ending created by Hokusai in a moment of inspiration. The third line also diverts the reader from extrapolation and a prediction of what might conclude the poem.

The haiku problem is for a machine to generate a third line that is both surprising and pleasing to the reader. Attempts at machine-written haiku have been able to achieve the correct syllable count (for example, in Aguiar 2019). However, further work is required to achieve an elusive novel close.

The No-Win Self-Driving Car Problem

A continuous flow of conscious choices exists in driving situations. Even in relatively stable conditions, for example, cruising down the highway, humans choose speed, acceleration/deceleration, and direction. Add in entertainment, and the choices grow. Currently, autonomous vehicles integrate robotics, image recognition, comparative insight, and deep learning. However, a vehicle’s response to turbulence introduced into the environment is limited to learned scenarios. But what about a situation unanticipated and unlearned by the autonomous systems?
An example of this would be a scenario in which multiple externalities are introduced into the moving vehicle environment to create a no-win situation. How would a self-driving car, with a passenger, make a decision between crashing into an 80-year-old man, a pregnant woman, or a toddler, or driving off a cliff (there being no other alternatives)? The controller of the vehicle has several choices to make, be the controller a human or an AI system. As Nyholm and Smids (Nyholm 2016) have written, this problem is being addressed in the development of accident algorithms, and it is not analogous to the trolley problem that comes from the study of ethics. The authors of this paper contend that currently, only a human can make an acceptable conscious choice in this no-win scenario, where acceptability might differ from culture to culture. The authors are also cognizant of the risk of being caught in a prisoner’s dilemma in pursuit of an artificial solution. An Artificial Creative Intelligence machine, developed for music improvisation, could provide the level of computational creativity required for making real-time choices to minimize physical, emotional, ethical, and societal damages.

Conclusion

Artificial Creative Intelligence requires a fundamental rethinking of how knowledge is managed and how intelligence is processed in vivo. The Insight-Knowledge Object model is offered as a new framework for scientists and engineers working in the fields of computer science, neuroscience, psychology, etc. Each level of knowledge object builds on the level beneath it. In order to continue up the hierarchy, knowledge must be captured and modeled differently from current and prior attempts at artificial creativity. In the IKO model, creativity is dependent on all levels of knowledge that come before it: void, null, data, facts, know-how, memories, wisdom, patterns, predictions, and imitations.
If any of these necessary levels is skipped, then the creative process collapses much like a house of cards, and Artificial Creative Intelligence remains out of reach.

Still unknown are the mathematics and implementations required to execute Artificial Creative Intelligence based on the proposed IKO architecture. The authors recognize that blurry lines exist between levels, which might direct future work in the direction of analog computing. To be determined is the exact nature of the insight functions. Are matrix algebra, the calculus, statistics, and probability sufficient to push the current AI boundary into creative insight processing? If not, what mathematics is needed for a software solution? How will quantum computing impact AI? Will an analog network of solid-state neurons play a role? Additionally, the biological basis for the IKO model has yet to be explored to understand the natural science foundation of the insight functions.

The issue of speed also requires solving. For ACI to be truly effective in numerous use cases, response times must match or improve upon those of human processing.

To be addressed as well is training an Artificial Creative Intelligence machine if deep neural networks are incorporated. Unsupervised training is particularly challenging in artificially judging pleasantness of newly generated music. Baker et al. (Baker 2020) have launched an initiative in Human-Assisted Training-AI as a direction for future work in machine learning.

The above questions and issues require continued research and exploration across multiple disciplines, including work in neuroscience, psychology, mathematics, computer science, electrical engineering, and philosophy.

The challenge is how to break the imitation barrier, if at all possible.

Acknowledgements

In-kind support for this work was provided by Audio Intelligenzia. The authors give thanks to the independent, anonymous reviewers of this paper for their feedback and suggestions.

References

Abraham, A. 2018. The Neuroscience of Creativity. Cambridge Fundamentals of Neuroscience in Psychology. Cambridge, U.K.: Cambridge University Press.
Aguiar, R., and Liao, K. 2019. Autonomous haiku generation.
Andabi, Soundcloud 2017. Voice style transfer to Kate Winslet with DNNs, dabi/sets/voice-style-transfer-to-kate-winslet-with-deep-neural-networks.
Baker, J. K.; Baker, B. J.; Huang, X.; Reddy, R.; Mitchell, T.; Garibay, I.; Raj, B.; Singh, R.; and Georgiopoulos, M. 2020. An introduction to human-assisted training for artificial intelligence.
Boden, M. 2004. The creative mind: Myths and mechanisms (2nd edition). London: Routledge.
Chen, R. 2009. Knowledge architecture. White paper, Audio Intelligenzia, San Jose, CA.
Chen, R. 2020. Artificial inspiration machine. Patent application filed with United States Patent & Trademark Office.
Eliot, T. S. 1934. The rock. London: Faber & Faber.
Gao, Y.; Singh, R.; and Raj, B. 2018. Voice impersonation using generative adversarial networks. arXiv:1902.06840v1.
Geravanchizadeh, M., and Gavgani, S. B. 2020. Selective auditory attention detection based on effective connectivity by single-trial EEG. Journal of Neural Engineering 17(2).
IBM 2019. The quest for AI creativity. artificial-intelligence/ai-creativity.html.
Internet Archive 2019. fulDead
Kurosawa, A. 1954. Seven Samurai. Tokyo: Toho Co., Ltd.
Nyholm, S., and Smids, J. 2016. The ethics of accident-algorithms for self-driving cars: an applied trolley problem? Ethical Theory and Moral Practice 19, 1275–1289.
Olenik, A. 2019. What are neural networks not good at? On artificial creativity. Big Data & Society. 6(1)1. https://jour-
Raj, B.; Agarwal, S.; and Raj, T. J. 2019. Music generation by deep learning. Unpublished work from Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA.
Rowley, J. 2006. The wisdom hierarchy: representations of the DIKW hierarchy. Journal of Information Science.
Runco, M. A., and Jaeger, G. J. 2012. The standard definition of creativity. Creativity Research Journal 24(1).
Sharma, N.; and Google 2008. The origin of data information knowledge wisdom (DIKW) hierarchy. Research Gate.
Stein, M. I. 1953. Creativity and culture. Journal of Psychology 36(2).
Turing, A. M. 1950. Computing machinery and intelligence. Mind 59.
Zeleny, M. 2005. Human systems management: integrating knowledge, management, and systems. World Scientific.
Zhou, Z.; Yang, Y.; Cai, Z.; Yang, Y.; and Lin, L. 2019. Combined layer GAN for image style transfer. IEEE Xplore.
Zimmerman, K. 2017. History of computers: a brief timeline. Live Science.

Where is the Life we have lost in living?
Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?
T.S. Eliot, 1934. The Rock