Alex,
I agree that robotics covers a very important range of applications. Among the most important applications of LLMs is the ability to talk (or type) in a natural language to control and communicate with robots and other kinds of systems. The same kind of communication will be used with any and every kind of machinery that is stationary, moving, or flying -- as part of any kind of device used on earth or in space.
I say that in order to show that I am an enthusiastic supporter of well-designed and safely controlled applications of LLMs. But I also want to emphasize that the second L in LLM stands for "Language". That is good for the user interface, but it implies a limitation to what can be expressed in ordinary natural languages.
Alex> this topic is overwhelming https://www.v7labs.com/blog/ai-in-robotics
That is an impressive range of valuable applications. But note that NONE of them use LLMs. Please reread the article to see what they do and how they do it. LLMs may be useful to simplify the user interface, but they won't replace the AI methods that are currently being used.
I would also like to cite another very impressive, truly graphic application for which a language-based LLM would be totally useless (except perhaps for the user interface): "Towards Garment Sewing Pattern Reconstruction from a Single Image", https://arxiv.org/pdf/2311.04218v1.pdf
Note that the operations are multidimensional spatial transformations. Sewing patterns are two-dimensional DIAGRAMS that humans or machines use to cut the cloth for constructing a garment. And they are mapped to and from three-dimensional structures (a human body and the clothing on it). This is an extremely time-consuming process for humans to perform, and words (from humans or LLMs) are useless for specifying how to perform the graphic transformations.
Sewing patterns are just one of an immense range of applications in every branch of construction from houses to bridges to airplanes to space travel and operations by robots along the way. LLMs are hopelessly limited by their dependence on what can be expressed in language. They won't replace the AI in the article you cited, and they would be useless for the AI used to derive sewing patterns -- or many, many other kinds of graphic transformations, stationary or moving.
These issues, by the way, are the topic of the article I'm writing about diagrammatic reasoning by people. The most complicated part is the step from action and perception to diagrams. That article about sewing patterns is an example of the kinds of transformations that the human brain does every second. Those transformations, which Peirce called phaneroscopy, are a prerequisite for language. Most of them are performed in the cerebellum, which is the high performance graphic processing unit (GPU) of the human brain.
Some people claim that they are never consciously aware of thinking in images. That is true because everything in the human GPU (cerebellum) is outside of the cerebral cortex. When people are walking and talking on their cell phones, the cerebellum is in total control -- until they step off the curb and get hit by a bus.
John
PS: It's true that anything expressed in mathematics or computer systems can be translated to a natural language. But the result of writing out what each machine instruction does would be overwhelming. Nobody would do that.
----------------------------------------
From: "Alex Shkotin" <alex.shkotin(a)gmail.com>
IN ADDITION: but this topic is overwhelming https://www.v7labs.com/blog/ai-in-robotics
Tue, 28 Nov 2023 at 10:35, Alex Shkotin <alex.shkotin(a)gmail.com>:
John,
It will be interesting to see what types of GenAI or any other AI are used in robotics.
Alex
Alex,
Re LLMs: Of course, Arun and I use LLMs for what they do. Please note our joint article: Majumdar, Arun K., & John F. Sowa (2009) "Two paradigms are better than one, and multiple paradigms are even better," Proceedings of ICCS 2009, edited by S. Rudolph, F. Dau, and S. O. Kuznetsov, LNAI 5662, Berlin: Springer, pp. 32-47. https://jfsowa.com/pubs/paradigm.pdf
Two paradigms are better than one, and multiple paradigms are even better. Arun has certainly adopted LLMs as yet another very important paradigm. They can support and extend the 60+ years of tools developed for AI, computer science, technology, etc. But they don't replace them.
But what I do criticize are PEOPLE who claim that LLMs are a universal AI tool that can come close to or even rival human intelligence. For that, there is ZERO evidence. What I have been emphasizing is the immense complexity of the human brain (and the brains of other mammals, especially the great apes). Compared to the human brain, LLMs are just beginning to scratch the surface.
Alex> GEG is a mental construct. I just thought that blob might be useful when implementing GEG. Mental constructions are usually done at a mathematical level of accuracy
Re blobs: Yes, they are very useful for computational purposes. When GEGs are used to represent computational methods, blobs would certainly be one of many kinds of implementations. Other options would include every digital and analog method of recording anything. There is no reason to exclude anything that anybody might find useful for any purpose. We have no idea what might be invented in the future.
Re mental: We have to distinguish multiple kinds of things: MENTAL (human brains and the brains of other animals); THEORETICAL (pure mathematics); DIGITAL computational; ANALOG computational; and various mixtures of them. For the mental, there is a huge amount of research, and every research question opens up a few answers and even more questions. For the theoretical, there are no limitations; mathematicians have proved many important theorems about infinite structures.
Alex> You are using "an open-ended variety of methods". Perhaps mine will be useful to you too.
Of course, when I say "60+ years of AI, computer science, technology, etc.," that includes all of them. But some of them are obsolete or based on mistaken theory. We always need to compare any old technology with new discoveries to see which is better. More often than not, we find important features of the old technology that can be combined with the new methods. That's one reason why I emphasize the DOL standard by the Object Management Group (OMG). It emphasizes methods for combining, relating, interoperating, and coexisting.
John
----------------------------------------
From: "Alex Shkotin" <alex.shkotin(a)gmail.com>
John,
You criticize LLMs all the time, but they are not the Carthage of GenAI, just a fragrant flower on the tip of the iceberg, one that everyone uses as a component; just look at our autumn session [1]. And you and Arun use them.
Now the most interesting question is how the Q* and the AI scientist of OpenAI work.
By the way, the main problem with AI is how quickly robots "get smart", because they act. Some hotheads might hook them up to an AI supercomputer. Moreover, there may be no LLM involved at all; well, maybe just one to talk to people.
GEG is a mental construct. I just thought that a blob might be useful when implementing GEGs. Mental constructions are usually done at a mathematical level of accuracy. This was done for category theory, from which topos theory was born. It was done in the HoTT project [1.5]. Probably at some point you will give us a text to read in which GEG is described as a kind of mathematical superstructure. There is, for example, "Topoi: The Categorial Analysis of Logic" [2]. Why not a "GEG: the structure of everything"? In addition, there is L. Horsten, "The Metaphysics and Mathematics of Arbitrary Objects" [3].
You are using "an open-ended variety of methods". Perhaps mine will be useful to you too.
In "OWL 2 Functional Style operators from HOL point of view" [4] it is shown that HOL is sufficient for OWL2.
In "English is a HOL language message #1X" [5] it is shown that natural language is HOL.
The “Theory framework - knowledge hub message #1” [6] proposes a method for storing theoretical knowledge on a global scale in a concentrated, publicly accessible, and usable manner.
I am now looking at what the theory framework for genomics will be, and at what part of this framework, and in what form, is contained in the GENO ontology [6].
Theoretical knowledge will necessarily include certain mathematical methods, including algorithms. This is the subject of specific research for each specific science. I hope to find out what the mathematics of genomics is next year.
I am looking forward to reading "GEG, the structure of everything." 🙂
Alex
[1] https://ontologforum.com/index.php/OntologySummit2024
[1.5] https://homotopytypetheory.org/book/
[2] https://projecteuclid.org/ebooks/books-by-independent-authors/Topoi-The-Cat…
[3] https://assets.cambridge.org/97811087/06599/frontmatter/9781108706599_front…
[4] https://www.researchgate.net/publication/336363623_OWL_2_Functional_Style_o…
[5] https://www.researchgate.net/publication/366216531_English_is_a_HOL_languag…
[6] https://obofoundry.org/ontology/geno.html
Alex,
Re Blob: Generalized existential graphs can contain anything in any format inside any area. There is no limit in theory (for proving theorems or analyzing options), but there are limits imposed on whatever implementation (digital or analog or neural) that may use whatever technology is available. A blob is just one among an open-ended variety.
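As a minimal illustrative sketch (my own, in Python; the class names are hypothetical and not taken from any actual GEG implementation), an area of that kind can be modeled as a container whose contents freely mix discrete symbols, nested areas, and opaque blobs:

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Symbol:
    """A discrete name or relation label."""
    name: str

@dataclass
class Blob:
    """An opaque chunk of recorded data: image, audio, video, anything."""
    media_type: str
    data: bytes

@dataclass
class Area:
    """One area of a generalized graph; its contents may freely mix
    discrete symbols, opaque blobs, and nested sub-areas."""
    contents: List[Union[Symbol, Blob, "Area"]] = field(default_factory=list)

# An area containing a discrete symbol, a raw image blob,
# and a nested sub-area with its own discrete structure.
sheet = Area([
    Symbol("Cat"),
    Blob("image/png", b"\x89PNG\r\n..."),
    Area([Symbol("On"), Symbol("Mat")]),
])
print(len(sheet.contents))  # 3
```

The point of the sketch is only that nothing in the container type restricts what an area may hold; a blob slot for continuous data sits alongside the discrete symbols without any special machinery.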
Last week, I sent an excerpt from the article I'm writing. It was called Excerpts234.pdf, dated Nov 17.
I am now sending a larger version Excerpt234.pdf, attached below. The new version includes larger excerpts, and I include a list of references with URLs on the last page. I have referred to some of them before, and I may mention others in my notes. So those references may explain (or confuse) many of the issues we have been discussing.
In particular, the oldest reference is chapter 7 (cs7.pdf), Limits of Conceptualization. This discusses issues about forming concepts, which are a prerequisite for any kind of language or many kinds of diagrams. If I were writing a new version today, I would add many more items, but even that chapter has quite a few complexities to consider.
One reviewer, who wrote a favorable review of the book, said that he was surprised that Chapter 7 seems to refute everything in the previous chapters. But that is not what I meant. cs7.pdf summarizes the many topics that could not be handled by the AI methods of 1984. In fact, they still cannot be handled by the latest AI methods today; LLMs can't even begin to handle them. But I'm adding more to Section 7 of my current article to explain why.
Another interesting article from 2007, "Language Games: A Foundation for Semantics and Ontology", goes into many of the limitations of language of any kind. It also discusses some of the ways of getting around the problems. Those are the kinds of methods that require technologies other than LLMs. You can't solve problems created by language with systems that are based on (or limited by) language.
Another example is "Two Paradigms Are Better Than One, and Multiple Paradigms Are Even Better" (by Majumdar & Sowa). The basic idea is that humans have an open-ended variety of methods of thinking and reasoning. The technologies that Arun and I (and others) have been developing use an open-ended variety of methods -- the more the better. Systems based on a single technology, such as LLMs, cannot have the open-ended flexibility of human intelligence.
That is the basis for all the criticisms I have been making about LLMs. I am not saying that LLMs are bad; I'm just pointing out their limitation to a single paradigm. That paradigm is very powerful for what it does. But by itself, it cannot begin to compete with human intelligence.
Other references in that list deal with related issues. Altogether, they show a huge range of issues that require methods other than LLMs. The 60+ years of AI and computer science are not obsolete. They can do many kinds of operations that current LLMs (combined with artificial NNs) cannot begin to do.
John
----------------------------------------
From: "Alex Shkotin" <alex.shkotin(a)gmail.com>
John,
Thank you. This is the usual way to think about reality: complex, chaotic, infinite, etc.
And let me add that storing any perception-like information (pictures, movies, sounds, etc.) in a computer is known as a blob [1].
So my preliminary supposition is that storing GEGs in a computer is not a problem; the processing is the problem.
To put it simply, as a programmer, let me propose just adding a blob data type to CL.
For me it is wrong to read notes about theory. I need to read the theory.
By the way, it is interesting to follow what kind of perception-like information real robots operate with.
Alex
[1] https://en.wikipedia.org/wiki/Binary_large_object
Alex,
It's very easy to find such fragments:
Alex> Please give me an example of "complex continuous fragment, which would require an uncountable amount of math to specify".
Just open your eyes. Everything you see has an open-ended amount of complexity. What you see at one glance can be captured with a digital camera that has a finite number of pixels. But if you look closer at some fragment with the same camera, the number of pixels stays the same, so the fragment you're now viewing is captured at a much finer resolution.
You could use more technology to get down to higher and higher resolutions. The only limit is determined by Planck's constant, which is the limit of uncertainty.
That's the physical limit. But the mathematical limit can go to arbitrary depths. Any fragment of an arbitrary continuous function on an n-dimensional space is itself continuous. If you allow such fragments in your formalism, there is no limit to the depths you can explore in theory. You can prove general theorems about those depths. But any physical example is limited by the technology.
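The point about depth can be illustrated with a small computational sketch (my own illustration, not from any article under discussion): a rapidly oscillating function has detail that coarse sampling simply misses, and only finer and finer sampling approaches what the mathematical object actually contains.

```python
import math

def zero_crossings(f, a, b, n):
    """Count sign changes of f sampled at n evenly spaced points in [a, b]."""
    xs = [a + (b - a) * i / (n - 1) for i in range(n)]
    ys = [f(x) for x in xs]
    return sum(1 for y0, y1 in zip(ys, ys[1:]) if y0 * y1 < 0)

# sin(50x) on [0, 2*pi] has about 100 sign changes.
f = lambda x: math.sin(50 * x)
for n in (16, 64, 256, 1024):
    # Coarse sampling undercounts; finer sampling converges toward ~100.
    print(n, zero_crossings(f, 0.0, 2 * math.pi, n))
```

No finite sampling resolution is "the" function; each refinement reveals structure the previous one could not represent, which is the digital analogue of looking closer at a continuous fragment.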
The formalism for GEGs is specified in Section 3 of Excerpts234.pdf, which I attached to a previous note. Subsets of that have been implemented by my colleague Arun Majumdar. Versions of that description can be implemented with fragments that are limited to whatever resolution your computer can support.
The theory has no limits. But any implementation is limited to what a digital computer can represent. However, it might be possible to go farther with some kind of analog-digital hybrid.
John
----------------------------------------
From: "Alex Shkotin" <alex.shkotin(a)gmail.com>
Sent: 11/20/23 2:37 AM
To: ontolog-forum(a)googlegroups.com
Cc: CG <cg(a)lists.iccs-conference.org>, Peirce List <peirce-l(a)list.iupui.edu>
Subject: Re: [ontolog-forum] Diagrams as defined by C. S. Peirce
John,
I should read more about GEG.
Is it possible to put it into the computer?
Please give me an example of "complex continuous fragment, which would require an uncountable amount of math to specify".
Alex
Alex,
Formally defined existential graphs (EGs) and conceptual graphs (CGs) are precisely defined mathematical notations. In fact, CGIF (Conceptual Graph Interchange Format) is defined as an ISO standard representation of Common Logic. Therefore, any or all notations of the Semantic Web can be expressed in terms of CGs.
Furthermore, Peirce's EGs are formally equivalent to the Core subset of Common Logic. Therefore, anything specified in Common Logic can be mapped to a subset of CGIF, which I sometimes call EGIF. But I now prefer a simpler notation called CLIP, which can be translated to and from EGIF or CGIF.
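For readers unfamiliar with these notations, a simple illustration (my own example; consult the ISO Common Logic standard for the exact grammar): the sentence "A cat is on a mat" might be written in CGIF roughly as

```
[Cat: *x] [Mat: *y] (On ?x ?y)
```

The square brackets introduce concept nodes with coreference labels *x and *y, and the parenthesized relation links them. The same content maps directly to the predicate-calculus form ∃x∃y (Cat(x) ∧ Mat(y) ∧ On(x,y)).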
However, Peirce began to define an extended version of EGs, which he called Delta graphs. Unfortunately, he had an accident for which the physician treated him with so much morphine that he was unable to do any serious work for six months. After that he was dying from cancer. But his various writings in the last few years of his life indicate what he planned to do with those extended graphs. From the hints he wrote, I specified an extension I call Generalized EGs (GEGs).
These GEGs can include arbitrary images (even continuous images) as parts. Your examples could be represented as GEGs. But the kinds of images they contain might include arbitrarily complex continuous fragments, which would require an uncountable amount of math to specify.
And thanks for the citation of the article: Priscila Farias & João Queiroz, "Images, diagrams, and metaphors: Hypoicons in the context of Peirce's sixty-six-fold classification of signs".
I haven't had a chance to read it in detail, but it looks like something I should cite in my article.
John
----------------------------------------
From: "Alex Shkotin" <alex.shkotin(a)gmail.com>
Hypoicons are impressive:
https://philarchive.org/archive/FARIDA
Alex
Sun, 19 Nov 2023 at 13:33, Alex Shkotin <alex.shkotin(a)gmail.com>:
John,
Thank you for the very useful description of Peirce's approach. But my question was about the diagrams you develop now. For me, a conceptual graph is a math object. And I think you are developing some extension of CG to get more modeling power.
Alex
Alex,
Thank you for demonstrating that LLMs can answer some questions and summarize some parts of an article (Excerpts234.pdf) without the slightest understanding of what they are doing. LLMs can extract verbal information from a Q/A session about details, but they may fail to understand why the author made those statements. Anybody who reads only those LLM summaries (see below) will have ZERO understanding of how, why, and what Peirce and I were trying to say.
The following question shows that you were misled by the LLMs to misinterpret that article.
Alex> I still think that your diagram is a kind of math object. Is that correct?
That depends on what you mean by a math object. Since mathematics can represent a continuous infinity of possible structures, everything that can be observed, analyzed, or represented is a math object.
Since every possible pattern of any kind can be represented by mathematics (possibly an infinite and/or a very dense finite approximation), calling something a math object means nothing. But Peirce's definitions start with raw experience from the senses (external perception and internal proprioception from bodily processes). That is what Peirce called the phaneron -- mental experience prior to any processing of any kind. Then he analyzed it the way people do.
The first stage of processing (what Peirce called phaneroscopy) selects aspects or parts of the phaneron that the mind (or whatever you want to call the neural processes that analyze and interpret experience) considers important or significant for some reason. Those parts are still raw and unprocessed input that retains all the complexity of the phaneron -- and they contain as much detail as human perception can receive, store, and process. Those parts are also attached to links (relations) that add further information, which happens to be in a discrete graph-like form.
The results of that early processing are HYBRID GRAPHS, which Peirce called hypoicons. (The Greek prefix hypo- is a synonym for the Latin prefix sub-.) The discrete links form a conventional style of diagram. But the parts they link are unprocessed, continuous chunks with all the detail of the fragments of the original imagery.
Further processing can analyze the fragments step-by-step into (a) discrete names, symbols and (b) finer and smaller hypoicons. Eventually, all the continuous material is replaced by discrete names, symbols, and relational links. Then and only then can the result be represented by a completely discrete existential graph (EG).
Finally, that discrete EG can be mapped to a discrete language, formal or informal, which can be processed by LLMs.
However, that graph might not have a simple mapping to a sentence in any natural language. Just look at the musical examples in Figure 7 of Excerpt234.pdf. On the left is a diagram of one bar of music in the usual musical notation. On the right is a conceptual graph (CG) which describes every note in the diagram on the left.
Then look at Figure 8, which takes just one small excerpt of the CG in Figure 7, which represents a single note of the bar of music on the left. Then look at the mapping of that one note to a CG and an EG in Figure 8. Then look at the mapping of that CG or EG to an English sentence or a formula in predicate calculus.
Any musician who plays music, or any listener who enjoys it, would almost never analyze the music on the left of Figure 7 into Figure 8 or into the formula in predicate calculus. The only exception would be what I just did: perform the analysis in order to show that the mapping to predicate calculus is NEVER what people do when they play or enjoy music.
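To make the contrast concrete, a single note spelled out in predicate calculus looks something like the following (an illustrative sketch of the style only, not the actual formula in Figure 8; the predicate names are hypothetical):

```latex
\exists x\,\bigl(\mathrm{Note}(x) \land \mathrm{pitch}(x,\mathrm{D4})
  \land \mathrm{duration}(x, \tfrac{1}{8}) \land \mathrm{position}(x, 3)\bigr)
```

Multiply that by every note in the bar, plus all the relations among the notes, and the formula becomes unreadable. That is precisely the point.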
Conclusion: Most of what people do in their daily lives is like music: it cannot be mapped to words or be processed by LLMs without a distortion and misunderstanding of its meaning. If you don't believe that claim, I can point you to an overwhelming amount of research in cognitive science that supports it many, many times over.
Informal analogy: The discrete residue may be adequate for some purposes. But the continuous info is lost. Sometimes the continuous bath water that was thrown away may have taken the baby with it.
John
----------------------------------------
From: "Alex Shkotin" <alex.shkotin(a)gmail.com>
John and Colleagues,
Just as a Saturday addition, have a look at this dialogue [1] with my favorite Claude 2 [2]. After I have uploaded the pdf to it.
Alex
[1]
Here is a Markdown version of our dialog:
# Dialog on Graphs Described in Text Excerpts
**Human**: what type of graphs are described in this text?
**Assistant**: Based on the excerpts provided, the text describes several types of graphs:
- Existential graphs (EGs) - These are a type of logical graph developed by Charles Sanders Peirce. They use nested graphs to represent logical assertions and rules of inference.
- Generalized existential graphs (GEGs) - These are an extension of EGs that can include additional notations such as images/diagrams within the graph areas.
- Conceptual graphs (CGs) - These are a type of semantic graph developed by John Sowa for representing natural language meanings. They use a bipartite structure with concept nodes and relation nodes.
- Diagrams - The text discusses various kinds of diagrams, such as musical diagrams, geometrical diagrams, etc. Diagrams are considered a type of iconic sign that represents relations between objects.
- Phemic sheets - These are diagrammatic representations of mental experience derived from perception and action on the world. They were proposed by Peirce as a way to iconically represent a "universe of discourse."
- Delta graphs - These were a proposed extension of Peirce's existential graphs, intended to support additional features like higher-order logic, probability, and virtual reality representations. They could potentially be represented using generalized existential graphs.
So in summary, the key graph types discussed are existential graphs, conceptual graphs, various diagrams, and proposed extensions like generalized existential graphs and delta graphs. The graphs aim to bridge logical formalisms with perceptual and mental models.
**Human**: what type of diagrams are described in this text?
**Assistant**: Based on the excerpts, the main types of diagrams described are:
- Existential graphs (EGs) - Logical diagrams developed by Peirce to represent propositions and inferences.
- Generalized existential graphs (GEGs) - Extension of EGs that can incorporate images, diagrams, and additional notation.
- Conceptual graphs (CGs) - Semantic networks that represent concepts as nodes and relations between them.
- Musical diagrams - Used to represent notes, melodies, tones, durations, etc. More iconic for musicians.
- Geometric diagrams - Such as those used in Euclid's proofs. Include shapes like points, lines, and triangles.
- Phemic sheets - Diagrammatic representations of mental experience and "universes of discourse."
- Delta graphs - A proposed extension of Peirce's EGs to handle more complex representations.
- Iconic diagrams - Diagrams that directly resemble the objects or relations they represent.
- Symbolic diagrams - More abstract diagrams relying on symbols rather than direct resemblance.
The key point is that diagrams act as an intermediate bridge between visual perception and formal logical languages. Different types of diagrams have different tradeoffs in terms of visual resemblance versus symbolic efficiency.
**Human**: What is a Delta graph?
**Assistant**: Based on the excerpts provided, a Delta graph is a proposed extension to Charles Sanders Peirce's existential graphs (EGs) that Peirce started specifying in 1911.
Some key points about Delta graphs:
- They were intended to extend EGs to support additional representational capabilities, such as higher-order logic, probability, modality, and virtual reality.
- Peirce's specification of Delta graphs was incomplete, but he envisioned them being able to represent things like "stereoscopic moving images."
- The capabilities Peirce wanted Delta graphs to have could potentially be represented using an extension of EGs called generalized existential graphs (GEGs).
- GEGs allow embedding diagrams, images, and notations within the graph areas. This provides a bridge between visual perception and symbolic logic.
- Delta graphs were meant to push EGs beyond just first-order logic and enable representation of Peirce's ideas like tychism (evolution according to chance) and synechism (continuity).
- To fully achieve Peirce's vision would require extending graphs to multiple dimensions and incorporating more advanced mathematics.
- Although Delta graphs were never fully realized, they point toward enhancing the expressiveness and reasoning capability of diagrammatic logics.
So in essence, Delta graphs were proposed by Peirce as a way to build on EGs to create a more versatile diagrammatic reasoning system, but the details were never fully fleshed out before his death. The ideas behind them influenced later extensions like GEGs.
**Human**: give Markdown version of this dialog.
**Assistant**: Here is a Markdown version of our dialog:
(See above)
[2] https://claude.ai/chat/bbfc2686-649e-4654-b47c-217dae51299e
Fri, 17 Nov 2023 at 22:19, John F Sowa <sowa(a)bestweb.net>:
Alex,
The diagrams you are thinking about are limited to what can be translated to a formal linear notation. But I am talking about the general theory of diagrams by C. S. Peirce, which is immensely richer. Those diagrams can INCLUDE full multidimensional images as parts at any arbitrarily refined precision.
They can also include as parts any mathematical structures of any kind in any number of dimensions. They can subsume as parts anything and everything that can be included in any digital device or any analog device that can simulate continuity at any resolution.
Please read the attached Excerpt234.pdf from Sections 2, 3, and 4 of the article I'm writing.
To answer your questions, they can represent EVERYTHING you are asking for in your list below.
Please read those excerpts. Then we can begin to discuss the issues in terms of the rich foundation that Peirce was developing. He died before he completed all the work, but he outlined the methods. And I have added some features along the lines that he had suggested.
John
----------------------------------------
From: "Alex Shkotin" <alex.shkotin(a)gmail.com>
John,
A diagram is just one kind of math structure: mostly labeled or attributed graphs. To model reality we use much, much more. One of the most powerful is an analytic function: we can capture a motion in a formula such as l = gt²/2.
I did not find any questions in the MN email, just a refusal to discuss my very tiny question.
Let me show [1] my short discussion with Claude 2 as an example of participating in the discussion.
Queries are mine, answers are from https://claude.ai
Next questions should be:
-What are math structures there?
-What kind of mind tasks may be solved using these structures? Do we have these algorithms?
It is of course possible that these algorithms will use some kind of graph structure to solve mind tasks.
As G.W.F. Hegel wrote, roughly: "All categories are at work in every mental act."
Alex
[1]
Q: How can we represent in a computer a 3D picture of a real object with the same level of detail as we see it?
Here are a few ways to represent a 3D picture of a real object in a computer with a high level of detail:
- 3D Scanning - Use a 3D scanner to capture the shape and surface details of the real object. This creates a 3D mesh model that can be imported into 3D modeling software. The resolution of the scan determines the level of detail.
- Photogrammetry - Take multiple photos of the object from all angles and use photogrammetry software to generate a 3D model by analyzing the 2D photos. The more photos from more angles, the higher the detail.
- 3D Modeling - Manually model the object in 3D modeling software like Maya, 3ds Max, or Blender. This takes artistic skill but allows full control over the level of detail. High polygon counts capture more detail.
- Point Clouds - Some 3D scanners generate point clouds which are collections of points in 3D space that depict the surface of an object. These can be rendered to visualize the object in very high detail.
- Volumetric Capture - Use special cameras and software to capture a full 3D volumetric video of an object. This captures surface details and internal structures. It requires complex hardware and software.
The key is to capture geometric details like shape, surface patterns, textures as well as color/shade information at a high resolution. The more surface and color data captured from the real object, the more realistic and detailed the 3D representation can be when rendered on a computer.
Q: Where can I read about these technologies?
Here are some good resources to read up more on the technologies for capturing detailed 3D models of real objects:
- 3D Scanning: The book "3D Scanning and Printing for Dummies" provides a good overview of 3D scanning methods and technologies. The Artec 3D website also has some helpful articles explaining scanning techniques.
- Photogrammetry: The book "Photogrammetric Computer Vision" goes deep into the math and algorithms. For a more general overview, the Wikipedia article on Photogrammetry is a good starting point. The AliceVision project also has resources on multi-view photogrammetry.
- 3D Modeling: Pixologic has a free ebook "The ZBrush 2020 Handbook" that covers high-detail organic 3D modeling for characters and objects. The book "3D Art Essentials" also covers 3D modeling principles using Maya.
- Point Clouds: The journal paper "A survey of point-cloud semantics in geometry processing" provides a technical survey of point cloud techniques. The Point Cloud Library (PCL) documentation also explains point cloud concepts.
- Volumetric Capture: Microsoft has research papers and videos explaining volumetric capture on their Mixed Reality Capture Studios page. The book "3D Video Technologies" also covers volumetric video concepts.
For a comprehensive overview, the Siggraph organization has many technical resources on 3D imaging and modeling, including courses and videos from their conferences.
Alex,
Mihai Nadin is asking very important questions. Perception and action are fundamental for every kind of thinking. When you perceive something, that sets the stage for anticipating action. The anticipation stimulates the thinking that leads to the action. I have emphasized the methods that Peirce developed in detail, but I recognize that anticipation is an important piece of the puzzle. LLMs, by themselves, don't contribute anything useful to those issues, but they can be important for communication. That's what they were designed for: machine translation among languages, natural or artificial.
Alex> my main topic is How to represent in a computer a 3D picture of a real object with the same level of detail as we see it.
Short answer: Impossible with LLMs, but methods of virtual reality are developing useful approximations.
Next question: How do humans and other animals process the continuous imagery they perceive, decide what to do, and do it? And if there is some reason to communicate with other animals, friendly or not, how do they decide to activate their communication methods? On this latter point, LLMs promise to make important contributions.
Language follows the heavy-duty thinking. Its focus is on communication. But it's impossible to understand what and how language communicates without starting at the beginning and following the many steps before language gets involved in the process.
I emphasized diagrams as an important intermediate stage. The first step from imagery to diagrams to language begins by breaking up the continuum of perception and action into multiple significant image fragments and their interrelationships.
Those fragments, which Peirce called hypoicons, retain a great deal of the continuity. You now have a diagram that links continuous parts to one another in two different ways: (1) geometrical positions in the original larger image, and (2) symbolic relations that identify and relate those fragments.
This analysis continues step by step to replace continuous parts with symbols that name them or describe them with discrete detail. But many interactions and operations can take place at that early stage. When you touch something hot, you don't have to identify it before you jump away from it.
Some people say that they never think in images. That is because they don't have a clue of what goes on in their brains. A huge amount of the computation on perception and action takes place in the cerebellum, which contains over 4 times as many neurons as the cerebral cortex. In effect, the cerebellum is the Graphic Processing Unit (GPU) that does the heavy duty computation.
Nothing in the cerebellum is conscious, but all its computations are processing that continuum of raw sensations from the senses and the huge number of controls that go to the muscles. That is an immense amount of computation and INTELLIGENCE that takes place before language even begins to play a role in what people call conscious thought.
The final diagrams that have replaced all the raw imagery with discrete symbols on the nodes and links of a diagram are the last stage before language -- in fact language is nothing more than a linearized diagram designed for translation to linearized speech.
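The claim that language is a linearized diagram can be sketched in code. Here a toy "diagram" is a labeled graph of symbolic nodes and named relations, and a depth-first traversal flattens it into a linear token stream, the analogue of turning a conceptual graph into a spoken sentence. The names (`diagram`, `linearize`) and the tiny example graph are purely illustrative, not from any source in this thread:

```python
# A toy "diagram": nodes carry symbolic labels, links carry relation names.
diagram = {
    "cat": [("on", "mat")],
    "mat": [("color", "red")],
    "red": [],
}

def linearize(graph, start):
    """Depth-first serialization of a labeled graph into a flat token list."""
    tokens = [start]
    for relation, target in graph[start]:
        tokens.append(relation)
        tokens.extend(linearize(graph, target))
    return tokens

print(" ".join(linearize(diagram, "cat")))  # cat on mat color red
```

Note what the flattening loses: the 2-D layout of the diagram is gone, and only the symbolic nodes and relations survive in the linear order, which is exactly the trade-off described above.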
The overwhelming majority of our actions bypass language interpretation and communication. That's why people get in trouble when they're walking or driving while talking on a cell phone. While their attention is focused on talking, the rest of their body is on autopilot.
All the heavy duty intelligence occurs long before language is involved. Language reports what we already thought. It is not the primary source of thought. However, language that we hear or read does interact with all the imagery (AKA virtual reality) in the brain. The best intelligence integrates all aspects of neural processing.
But language that does not involve the deeper mechanisms is superficial. That's why LLMs are often very superficial. The only deeper thought they produce is plagiarized from something that some human thought and wrote.
And by the way, I recommend the writings in https://www.nadin.ws/wp-content/uploads/2012/06/edit_prolegomena.pdf
They're compatible with what I wrote about Peirce, but I believe that Peirce's analyses of related issues went deeper into the complex interactions. Those writings on anticipatory systems are compatible with and supplementary to Peirce's work, which I believe is essential for relating the complexities of intelligence to the latest and greatest research in AI today.
John
----------------------------------------
From: "Alex Shkotin" <alex.shkotin(a)gmail.com>
Dear and respected Mihai Nadin,
I look at my description as a problem statement. Your email means that you will not take part in the discussion of this problem. I'm truly sorry.
Best wishes,
Alexander Shkotin
Thu, Nov 16, 2023 at 00:32, Nadin, Mihai <nadin(a)utdallas.edu>:
Dear and respected Alex Shkotin,
Dear and respected colleagues,
- YOU wrote:
my main topic is How to represent in a computer a 3D picture of a real object with the same level of detail as we see it.
Let us be clear: the semiotics of representation provides knowledge about the subject. The topic you describe (your words) is in this sense a false subject. May I invite your attention to https://www.nadin.ws/wp-content/uploads/2012/06/edit_prolegomena.pdf
Representations are by their nature incomplete. They are of the nature of induction.
Visual perception is informed by what we see, but also by what we think, by previous experiences.
- Mathematics: I brought to your attention (long ago) the impressive work of I.M. Gel’fand. Read his work—the limits of mathematical representations (and on operations of such representations) are discussed in some detail.
- Mathematics and logic—leave enough room for diagrammatic thinking as a form of logical activity. C.S. Peirce (to whom John Sowa often refers) also deserves your time. Read his work on diagrams. Mathematical thinking is not reducible to logical thinking (in diagrams or not). The so-called natural language (of double articulation) is more powerful than the language of mathematics—it allows for inferences in the domain of ambiguity. It is less precise, but more expressive.
- After all, ontology engineering—a HUGE SUBJECT—is nothing but the attempt to provide machines, working on a two-letter alphabet under the guidance of Boolean logic, with operational representations of language descriptions of reality.
Best wishes.
Mihai Nadin