New Websites and Other Work

As I noted recently in a post on my personal site, I have been working on my websites a lot lately, which includes creating several new ones. Notable are MakingSocietyWork, DecisionMakingAndEstimation, ErrorCovariance, BipartiteMatching and ContentManagement. I have added these sites because I am now attempting to put together a network of sites, one for each of the main ideas behind what I am calling the SocialSystemsProject.

Whether anything comes of this or not will depend on public reception of my sites. I am not expecting any. There are just too many other websites out there, too many distractions, and people just don’t care.

Well, all I can do is work away at it and hope.

Posted in Uncategorized | Leave a comment

Recent “About” Page

I’ve completely changed the home page for this site.  Here is the former one:

This site is nominally devoted to Social Technology, but will include discussions, not only of related topics but of topics dear to the heart of its author.

Most especially, this site exists to promote a specific view of Social Technology. It is one highly influenced by mathematics and information theory.  Various pages and posts explain how and why.

A fundamental concept is that of a social network, which is what computer scientists and mathematicians call a graph.  Each link, or edge, in such a graph may be given a weight, which is important for manipulating the graph.  Here we discuss graphs (networks) in which the weight or strength of a link is a representation of how well the individuals in the social network communicate.  This is discussed in terms of signal strength and distortion, important concepts in information theory.

Signal strength is further broken down into the compatibility of the individuals involved and the amount of communication between them.  Compatibility also affects distortion, so it would be more correct to express signal strength and distortion in terms of compatibility and contact.

An analysis of social networks in this way reveals how strongly the flow of information through them is affected by how many compatible friends one has, and at what levels.  A scale is presented for judging compatibility, and its use is explained. From this analysis one basic conclusion is drawn: that having a few very compatible friends and being in close contact with them is much better than having many less important “friends” in your life.

Posted in Uncategorized | Leave a comment

Some Visual Links

Click on the images to visit related sites:

An online novel about applying social technology in third world countries.

Online Novel about Social Tech Missionaries

A website about languages including programs, data and fiction is at:

Natural, Artificial and Non-Arbitrary Languages

A complete online novel about the application of social technology in a high-school is:

Social Tech High

Posted in Uncategorized | Leave a comment

work in progress

Points of View

Power and Influence Structures

The Role of Requirements Analysis in Social Technology

What is combinatorial optimization? What has it to do with jobs?

Comparison with Nanotechnology

Acronymic language

Impedance

Eigenvectors and Archetypes

Academia

Feasibility

Social Filtering

Linearity

Some philosophical thoughts

About these pages, with pagelist


Copyright © 1998 Douglas P. Wilson

Posted in Old Pages | Leave a comment

What is Combinatorial Optimization?

Why you need have no fear of being “optimized”

And what does it have to do with jobs?

Optimization just means “finding the best”, and the word ‘combinatorial’ is just a six-syllable way of saying that the problem involves discrete choices, unlike the older and better-known kind of optimization, which seeks to find numerical values.

The employment or assignment problem, matching people with jobs, is what used to be called a pigeonhole problem. There are a number of discrete slots, or pigeonholes, which we might number 1, 2, 3, 4, 5, 6, … and there are a number of objects, A, B, C, D, E, F, … which have to be put in the most appropriate pigeonhole. We call an assignment of objects to slots a feasible solution if it assigns each object to a unique slot. Here are some feasible solutions:

1A, 2B, 3C, 4D, 5E, 6F, … (simplistic initial solution, match in order listed)

1B, 2A, 3D, 4C, 5F, 6E, … (start with initial solution, swap pairs)

If there are N slots and equally many objects, then there are N ways of assigning the first object, N-1 ways of assigning the second object and so on, so that the number of solutions is N! (which is pronounced “N factorial”). If there are 3 slots, there are 6 different feasible solutions:

1A, 2B, 3C

1A, 2C, 3B

1B, 2A, 3C

1B, 2C, 3A

1C, 2A, 3B

1C, 2B, 3A
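As a small illustration, the six feasible solutions for three slots can be enumerated mechanically. This is only a sketch in Python; the names `slots`, `objects` and `feasible_solutions` are chosen here for illustration and do not come from the original pages.

```python
from itertools import permutations

slots = [1, 2, 3]
objects = ["A", "B", "C"]

# Each permutation of the objects, read against the fixed slot order,
# is one feasible solution: every object gets a unique slot.
feasible_solutions = [
    [f"{slot}{obj}" for slot, obj in zip(slots, perm)]
    for perm in permutations(objects)
]

for solution in feasible_solutions:
    print(", ".join(solution))
```

Enumerating like this is only practical for very small N, which is exactly the point made about the growth of N!.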

But if there are 6 slots and objects, there are 720 feasible solutions; if there are 12 slots and objects, there are about 479 million different solutions; and if there are 24 slots and objects, the number of feasible solutions is 6.2045 times 10 to the 23rd power.
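These figures are easy to check. The sketch below uses Python's standard `math.factorial`; the helper name `feasible_count` is just a label chosen for this example.

```python
import math

def feasible_count(n):
    """Number of ways to put n objects into n slots, one object per slot."""
    return math.factorial(n)

# Print the counts quoted in the text in scientific notation.
for n in (3, 6, 12, 24):
    print(f"{n:2d} slots: {feasible_count(n):.4e} feasible solutions")
```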

The technical term for this is Combinatorial Explosion — the explosively rapid increase in the number of possibilities with the number of items to be assigned.

Suppose we are dealing with a small company that has 50 employees. If everyone was interchangeable with everyone else, so you could really give any job to any person, then we would be actually dealing with the pigeonhole problem, and the number of possible ways of assigning people to jobs is 3.0414 times 10 to the power of 64. But, more realistically, suppose that each person is only suited to two jobs out of the 50. Then the number of different feasible solutions is at most about 2 to the 50th power, which is 1,125,899,906,842,624 or about one million billion.
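A brute-force count on a toy company shows how suitability restrictions shrink the solution space. This is a sketch under stated assumptions: the function `count_feasible` and the three suitability patterns (`anyone`, `pairs`, `ring`) are invented for illustration, and six people are used instead of 50 so that every permutation can be checked. It also suggests that 2 to the Nth power is best read as an upper bound when each person suits only two jobs, since the actual count depends on how the suitable jobs overlap.

```python
from itertools import permutations
import math

def count_feasible(suited):
    """Count assignments giving each person a distinct job they are suited to.

    suited[p] is the set of job indices that person p can do.
    """
    n = len(suited)
    return sum(
        all(job in suited[person] for person, job in enumerate(perm))
        for perm in permutations(range(n))
    )

n = 6
anyone = [set(range(n)) for _ in range(n)]                    # fully interchangeable
pairs = [{2 * (p // 2), 2 * (p // 2) + 1} for p in range(n)]  # two people share two jobs
ring = [{p, (p + 1) % n} for p in range(n)]                   # each job shared with a neighbour

print(count_feasible(anyone))  # n! when everyone can do everything
print(count_feasible(pairs))   # 2 choices per pair of people
print(count_feasible(ring))    # only two ways round the ring
print(math.factorial(50), 2 ** 50)  # the two figures quoted for 50 employees
```

Even the restricted counts grow exponentially with company size in the paired case, so the search space stays far too large to explore by hand.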

When the search-space or solution-space of feasible solutions is very large, the chances that any real solution arrived at by management is actually a good solution become very small. The solution-space is too big to search in any effective way, so people just pick the best solution they happen to stumble across.

This creates what I call the job-mismatch problem: most people are in the wrong job because finding the right jobs for everybody is a combinatorial nightmare: too many choices.

To my way of thinking, the unemployment problems facing many nations today are just the tip of the job-mismatch iceberg. It is so hard to match up people to jobs that lots of people just don’t get assigned to jobs at all.

I admit that there may be job shortages in very specific situations, but I’m convinced that on the whole, unemployment is just a combinatorial problem, not a job shortage problem. If you insist that you are a coal miner and nothing but a coal miner, and you insist on living in the same town you have always lived in, and refuse to travel long distances to work, then there may indeed be a job shortage when the only pit in town closes. But in general, there’s no job shortage.

Actually, the very idea of a job is a relic of earlier centuries. We can say instead that there are many tasks to be done and many people who are available to do them. The real combinatorial optimization problem is not matching people to jobs, but the ongoing problem of matching people to tasks. Having a set or sequence of tasks grouped together into a job, then finding a person to do that job is actually just a crude way of approaching the harder problem of finding people to do the specific tasks.

So there are at least two ways in which all this talk about unemployment is completely off-the-rails: first, it is not an economic problem involving job shortages, it is a combinatorial optimization problem. Second, it is not about jobs at all, it is about matching people up with specific tasks.

The point here is not just that everyone else has been addressing the wrong problem, but that nobody has dealt with the problem in the right way at all.

Jobs seem to depend on economics, since high unemployment and poor economic conditions coincide.  Unemployment seems to result from poor economic conditions — indeed, we often see companies in trouble laying off employees.

But I think we have the causal connection the wrong way around: I think economic conditions depend on how well we solve the job-mismatch problem.  Companies do fail because of bad management, but if it was easy for people to find  good jobs, those employees would have found other jobs long before the company got to the point of laying off workers.   People hang around in dead end jobs or in companies doomed by bad management because they feel they have no choice: it is too hard to find another job.

Unfortunately, we haven’t addressed the problem in the way software and systems engineers do — instead we leave it to politicians, whose only qualification seems to be the ability to get elected to public office.

Any qualified software engineer faced with a problem in which certain items were not finding slots would immediately look at the more general problem of matching items to slots, and would be certain to investigate the combinatorial aspects of the problem.

Let me generalize a bit: we need to match people to tasks, but we also need to match people with co-workers. In fact, almost all social problems involve matching and optimization.

Underlying almost all the social ills of our society is the combinatorial explosion of possibilities and the lack of adequate techniques for reducing the size of the search space.   Technology based on combinatorial optimization theory can provide ways around such problems.  It turns out that the “assignment problem” or “bipartite matching problem” is quite approachable — computationally intensive, but approachable.  There are good algorithms for solving it.
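To make “approachable” concrete, here is a minimal sketch of one classical method for the assignment problem, the Hungarian (Kuhn-Munkres) algorithm, which runs in time proportional to the cube of the problem size. It is offered only as an illustration and is not necessarily the algorithm used for the prototype software. The function name `solve_assignment` and the cost table are made up for this example, where `cost[i][j]` is a hypothetical measure of how badly person i suits job j.

```python
def solve_assignment(cost):
    """Minimum-cost assignment of rows to columns (needs rows <= columns).

    Returns (assignment, total) where assignment[i] is the column for row i.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0] * (n + 1)      # row potentials (1-indexed)
    v = [0] * (m + 1)      # column potentials (1-indexed)
    p = [0] * (m + 1)      # p[j] = row currently matched to column j, 0 if none
    way = [0] * (m + 1)    # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Walk back along the augmenting path, re-matching columns.
        while True:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
            if j0 == 0:
                break
    assignment = [None] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return assignment, total

# Hypothetical mismatch scores for three people (rows) and three jobs (columns).
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
assignment, total = solve_assignment(cost)
print(assignment, total)
```

The algorithm never lists all N! feasible solutions; it improves a partial matching one person at a time, which is why the work grows polynomially rather than explosively.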

One of the strong claims made by Soviet ideologists was that they had no unemployment because of  the advantages of central planning.  They were wrong.   Essentially they thought that they could solve the assignment problem for a society of millions of people by having a central core of  bureaucrats figure out who should do what, and they were wrong.   But exactly why were they wrong?

It all comes down to the size of the problem.  Assignment problems are roughly O(N^3) problems, meaning that the amount of work needed to solve them depends on the cube of the number of elements (that’s for Gabow’s algorithm).  Matching a few thousand nodes can take 30 minutes or so on my aging 120 MHz Pentium, but matching a few million nodes (a thousand times as many) would take not 1000 times as long but 1,000,000,000 times as long.
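The cubic scaling can be worked through in a few lines. This is a back-of-the-envelope sketch: the function `scaled_minutes` is invented for the example, “a few thousand” is taken to be 3,000 nodes and “a few million” to be 3,000,000, and constant factors, memory limits and hardware differences are ignored.

```python
def scaled_minutes(base_minutes, base_nodes, target_nodes, exponent=3):
    """Extrapolate running time assuming work grows as nodes ** exponent."""
    return base_minutes * (target_nodes / base_nodes) ** exponent

# 30 minutes for a few thousand nodes, scaled up to a few million nodes.
minutes = scaled_minutes(30, 3_000, 3_000_000)
years = minutes / (60 * 24 * 365.25)
print(f"{minutes:.3g} minutes, or about {years:,.0f} years")
```

Under these assumptions the half-hour job becomes one of roughly 57,000 years on the same machine.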

Even if they had had all the job-suitability, or skills and aptitude, data already prepared, just doing the calculations involved in matching 50 or 100 million people to jobs would have taken more computing power than existed on earth at the time the Soviet Union dissolved.

Clever approximation algorithms could probably approximate a good solution using computing power that is available today, but that could only be true if the true problem had been properly addressed.

Copyright © 1993, 1995, and 1998  Douglas P. Wilson

Posted in Old Pages | Leave a comment

Web Interface — Analysis and Design

This post was an old page which provided a brief analysis and design sketch for the web interface to be provided for the prototype software discussed on these pages.

(As of 2013, I am focusing on speech and facial recognition software instead of a web-based questionnaire, but what follows may be of some interest anyway.)

Analysis — What’s Required

It has to be interesting.  Fun, if possible.  It has to demonstrate the basic ideas so people have some hope of understanding them. It has to hint strongly at great things to come, to inspire people to look back here again.

That’s all true, but what specific requirements are to be met?  That’s still an open question, but here are a few notes:

  • it has to collect data for offline analysis, (but …)
  • it has to be an online interactive interface that people can get immediate feedback from
  • it has to run mostly on the user’s own machine, as client-side intensive software (using JavaScript or Java), not on the server side (using Perl or PHP), except where necessary
  • it must demonstrate as much as possible of the prototype software, even those parts hard for people to understand
  • it must provide for people to download and use much of that software on their own machines, and support that use with online explanation and help
  • it has to encourage others to help and provide mechanisms for them to find what they might help with and coordinate their work with others.

(much more analysis is needed, and will follow as soon as possible …)

Design — How to Build It

Most of the design work is yet to be done, and most of what has been done remains entirely in Doug Wilson’s disorganized brain.  Here are some design notes, representing just a small part of that work:

  • ordinary HTML forms will be used, such as the one now on the What Do You Want? page, and the data submitted with these forms will (at least) be mailed for offline processing
  • that data should also be mailed back to the user, not only for confirmation, but so that the user can do some offline processing with downloaded software
  • when possible, data collected by forms should also be processed online to provide more immediate user feedback; this will mean having online some other data for comparison purposes, either data previously collected, or existing social survey data
  • comparisons can be represented as scattergrams, showing a representative sampling of other user data, possibly resampled, in one colour with the position of the user in that data slice shown as a distinctive other colour
  • as soon as possible some use of existing social survey data for online demos will be provided as a teaching tool, and to explain what we can and will do with user data
  • matching for the various purposes listed in the CASA Proposal and outlined on the original What Do You Want? form will probably have to occur offline
  • the results of bipartite matching will be a suggested match for each person, company, (…group, etc.) with some person (etc.) and this will be communicated in some way to the people or companies involved using some kind of protocol that will allow the interested parties to communicate while remaining anonymous until they learn more about each other
  • as well as online or e-mail suggestions with confidentiality protocol, a real world approach would be to suggest meetings in groups by suggesting times and places for people to meet, where each meeting will include as many mutually-compatible individuals as possible, but without them knowing who may be compatible for what purposes
  • throughout this process feedback from the users will be collected with online forms; the exact way in which this feedback will be used is an enormous issue in itself, but in general it will be used with something like the generali
  • as well as matching people, organizations, and other things, as suggested in the CASA Proposal, use of information obtained by spiders about web sites and web pages, together with information about those sites and pages from their owners, can be used with numerical methods for site mapping, mapping the Internet, suggesting useful links, and other purposes that use some of the prototype software but are not matching problems in the usual sense

These are just notes, hardly even the beginning of a design, but they hint at design directions and will be followed by explicit material as it becomes available.   For more information, see the How It Works page, the (still rather sketchy) Web Interface Design page, and you might also look at the more journal-like record of dealing with the frustrating tools and techniques of web interface programming, on the page about Putting it on the Web.

Copyright © 2000 Douglas P. Wilson

Posted in Old Pages | Leave a comment

What (was) New

(circa 1998)

I am trying to expand my collection of web pages to include other topics, such as computer programming languages, operating systems, and politics, and will add such links occasionally, so please check back here from time to time to see what’s new.   I should have a “What’s New” page, I guess, but for now I’ll just add links to the new stuff here, where it will stay until linked in with existing text.

Here is what is new as of  Saturday, January 9th, 1999:

A page about the Global Ideas Bank at the Institute for Social Inventions and the prizes they offer for the best new ideas.

Two ideas from Nicholas Albery, chairman of the Institute for Social Inventions and editor of  “The Book of Visions” and “World’s Best Ideas” and other ISI publications.

Here’s what was new as of Sunday, Dec. 20th, 1998 —

An easier and less controversial matching problem, matching people for communication on the web or by e-mail, is described on a page I call “Net Net Baud Rate” (a silly play on words).  The material there is not actually very technical, despite the name.

For the sake of communicating with a correspondent who takes objection to my attempt to reduce the whole of  human communication to “net baud rate”, I am putting up here an essay on reductionism.

I’ve added a page about my educational background, to supplement the one on my academic interests posted earlier.

Here’s what was new as of Friday, Dec. 4th, 1998 —

The Acronymic Language — some more fundamental ideas about the ideal language project discussed elsewhere.

Corporations — ideas about changes to corporate law to reduce the undue influence of large corporations by discouraging predatory behaviour (instead of rewarding it).

Academic Interests — some of my educational background (the whitewashed version).

The Sailboat Metaphor — a discussion of free will and determinism.

The Particle Accelerator Metaphor — an alternative to the sailboat metaphor that emphasizes matching.

Here’s what was new as of Monday, Nov. 2nd, 1998 —

The Video Store example, an example of some of the methods that can be applied to provide people with useful suggestions.

Business Applications, which addresses more general business applications of these methods, including team formation by matching co-workers in a business and using carefully matched teams of co-workers to estimate important numbers such as project costs or even stock prices.

Crime and Punishment spells out why the future world I describe will have almost no crime, together with ideas for dealing with today’s prisoners and the very few criminals who may exist in the future.

Here is what was new as of Monday, Oct. 26, 1998 —

The Social Technology Page is a new title for the page formerly called “The Idea of Social Technology”, and it contains new content as well, including the anchors for the above two links.

These pages were new as of Monday, Oct. 19, 1998 —

The Role of Requirements Analysis in Social Technology

What is combinatorial optimization? What has it to do with jobs?

Power and Influence Structures

Points of View

Copyright © 1999   Douglas Pardoe Wilson

Posted in Old Pages | Leave a comment

Bootstrapping Society

a philosophically troubling case of recursion

So, (rather a leap here, I’m afraid) computers may help a bit, at least in bootstrapping the process, but in the long run they are rather irrelevant: too non-linear at present, and in the future probably bound to be more and more linear for the same reasons we are. Hofstadter has written about this in Metamagical Themas, where he discusses the idea that intelligent machines may have to develop human-like limitations.

But the bootstrapping process may be important. Picture this: if you are well tied into the social network you will have people who are good interfaces between yourself and the rest of the net: people whose distortions you can compensate for, people who have a good bandwidth when matched with you. Also you have people as friends who have the same interests you do, but tend to make different mistakes than you tend to make, and so you have ways of making better decisions than you could alone.

Indeed, if you are properly matched with people, to level 6 or better, you will be very well connected and complemented and so can make very good decisions, such as decisions about what job to take, what friends to have, and so on. If you just had compatible connections that would be enough to find you compatible connections, and so on.

If you are on the humanities side of the Two Cultures, you may find this philosophically troubling, whereas if you are on the science side, and especially if you are in mathematics or computer science, you may simply see this as a recursive procedure.

(To make sense of this you might need to see the linearity page.)

Copyright © 1998 Douglas P. Wilson

Posted in Old Pages | Leave a comment

video recommender system

The Video Store Example

There are two main problems that have bothered people in the past when I have tried to explain my views about social technology and social network optimization to friends and family. One problem is that the concepts are too hard to follow. I keep trying to address that problem with new text, but I am probably doomed to keep doing so for a long time to come, because nothing I write is quite easy enough to understand. The other problem is more approachable, because it concerns only the technical feasibility of what I am proposing.

In general I find it much easier to address technical problems, which are more concrete than some overall failure to understand the conceptual foundations behind them.

So this page is addressed to people who have some idea about what I’m trying to do, but don’t see any mechanism for it to work.

I am going to start with the hypothetical video-rental-store problem, an example that first came to mind a decade ago when I found myself taking my daughter Clara and niece Marika over and over again to the video store in search of something to watch.

I assume that this hypothetical store, like many others, maintains a membership list for customers and provides them with a membership card so they don’t have to keep providing a lot of ID when renting videos. I will assume they have a computer to handle this membership and that they keep a database of information on each customer — for the moment let’s ignore the privacy and confidentiality issues.

What follows are more unusual assumptions, but quite technically feasible:

  • I assume that in an effort to increase rentals this video store intends to give customers a page of personalized suggestions when they enter the store, and I assume a printer on the counter near the entrance.
  • I assume that videos have some magnetic or optical code on them that can be read automatically by an optical or magnetic reader located in the video-return slots.
  • I assume that videos are returned through labelled slots in the front of the counter near the entrance, out of sight of the cashier and other customers — and I further assume that these slots are clearly labelled “liked”, “disliked”, “didn’t watch”, or some similar rating system.

So the basic idea is quite obvious. Customers return videos through the appropriate slots and thus rate the video. Ideally the video store wants each customer returning videos to rent some new ones, so as soon as the customer returns the videos from last night the printer on the counter must spew out a page of recommendations, and they should be good ones, truly personalized to that customer’s tastes, however bizarre they may be.

It is important to emphasize at this point that I see this all as technology, not science, but for the sake of explaining how this all fits together I will add a video-scientist to the mix at an appropriate time.

But let’s see what the technology can do by itself, without a scientist being involved. First of all, as customers rent videos the computer adds the names or numbers of those videos to the database. When videos are returned the customers have been instructed to deposit them in the appropriate slot, to give the video store an electronic review: did the customer like it or not.

The computer can then maintain two sets of records, one for each customer, and another for each video. Essentially the record for each customer lists each video rented and the slot through which it was returned: liked, disliked, or whatever. The record for each video lists which customers rented it and the same mini-review from the return slots.
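The two sets of records might be sketched as a pair of dictionaries that are updated together each time a video comes back through a slot. Everything named here (`RETURN_SLOTS`, `customer_records`, `video_records`, `record_return`, and the sample customers and titles) is a hypothetical stand-in for whatever the store's real database would use.

```python
from collections import defaultdict

# Labels on the return slots, taken from the rating scheme described above.
RETURN_SLOTS = ("liked", "disliked", "didn't watch")

customer_records = defaultdict(dict)  # customer -> {video: slot used}
video_records = defaultdict(dict)     # video -> {customer: slot used}

def record_return(customer, video, slot):
    """Store one mini-review in both the customer's and the video's record."""
    if slot not in RETURN_SLOTS:
        raise ValueError(f"unknown return slot: {slot!r}")
    customer_records[customer][video] = slot
    video_records[video][customer] = slot

record_return("clara", "Babe", "liked")
record_return("marika", "Babe", "liked")
record_return("clara", "Heat", "disliked")

print(dict(customer_records["clara"]))
print(dict(video_records["Babe"]))
```

Keeping both views means the later comparisons can be made either customer by customer or video by video without re-scanning the whole rental history.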

So far so good, but that’s just record keeping. What matters to both the customers and the video store is the accuracy of the recommendations made in the printout provided to the customer on entering the store. If these recommendations do truly match the customer’s tastes, the customer will be happy, and will probably rent more videos, thus making the store happy too.

Essentially the recommendations are predictions, and these predictions will be confirmed or refuted when the customer returns the videos.

The underlying technology for doing this is entirely straightforward. To predict the customer’s response to a video the system can compare each customer to all the other customers and find some customers with similar tastes; it can also compare each video to all the other videos and find which other videos are most similar.

Two videos are similar if they appeal to similar customers, and two customers are similar if they like similar videos. That might seem like a paradox in which you can’t tell which videos are similar until you know which customers are similar, but can’t tell which customers are similar until you know which videos are similar. It is not a paradox at all.

One starts the process by just noting which customers liked the same videos, and which videos are liked by the same customers — then you can use successive approximation to include similarity data as well.
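One way this bootstrapping could look is sketched below. The choices here are assumptions made for the example, not the method of the original pages: the starting similarity is the Jaccard overlap of “liked” sets, and a refinement step scores two customers by the average similarity of the videos they liked. The names `jaccard`, `invert`, `initial_similarity`, `refine` and the sample data are all hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Overlap of two sets: shared members divided by all members."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def invert(likes):
    """Turn customer -> liked videos into video -> customers who liked it."""
    inverted = {}
    for key, items in likes.items():
        for item in items:
            inverted.setdefault(item, set()).add(key)
    return inverted

def initial_similarity(likes):
    """First approximation: things are similar if they share exact likes."""
    sim = {(x, x): 1.0 for x in likes}
    for x, y in combinations(sorted(likes), 2):
        sim[(x, y)] = sim[(y, x)] = jaccard(likes[x], likes[y])
    return sim

def refine(likes, other_sim):
    """Next approximation: average similarity of the things each one likes."""
    sim = {}
    for x in likes:
        for y in likes:
            pairs = [(a, b) for a in likes[x] for b in likes[y]]
            if x == y:
                sim[(x, y)] = 1.0
            elif pairs:
                sim[(x, y)] = sum(other_sim[p] for p in pairs) / len(pairs)
            else:
                sim[(x, y)] = 0.0
    return sim

customer_likes = {"ann": {"v1", "v2"}, "bob": {"v2", "v3"}, "cy": {"v4"}}
video_likers = invert(customer_likes)

customer_sim = initial_similarity(customer_likes)
video_sim = initial_similarity(video_likers)
refined_customer_sim = refine(customer_likes, video_sim)

print(customer_sim[("ann", "bob")], refined_customer_sim[("ann", "bob")])
```

In this toy data ann and bob share only one video exactly, but the refined score also credits them for liking videos that are themselves similar. Alternating `refine` between customers and videos for a few rounds is the successive approximation in miniature.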

If the number of videos available for rental is reasonably small, a few thousand perhaps, and the number of customers is also not too large (again a few thousand), then the amount of math required to do this is quite small.

Essentially the computer looks up your record, then searches the other customers’ records to find similar lists of videos liked and disliked. Then videos liked by customers whose taste seems to be like yours can be recommended to you. Not a difficult concept.
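For a single small store, the lookup just described might be sketched as follows. The scoring rule is an assumption made for this example: two customers agree to the extent that they gave the same rating to the videos both have rated, and a video is recommended when like-minded customers liked it and you have not rented it. The names `agreement`, `recommend` and the sample records are hypothetical.

```python
def agreement(mine, theirs):
    """Score in [-1, 1]: +1 if every shared rating matches, -1 if none do."""
    rated = ("liked", "disliked")
    shared = [v for v in mine
              if v in theirs and mine[v] in rated and theirs[v] in rated]
    if not shared:
        return 0.0
    same = sum(mine[v] == theirs[v] for v in shared)
    return (2 * same - len(shared)) / len(shared)

def recommend(target, records, limit=5):
    """Suggest unseen videos liked by customers whose ratings agree with target's."""
    mine = records[target]
    scores = {}
    for other, theirs in records.items():
        if other == target:
            continue
        weight = agreement(mine, theirs)
        if weight <= 0:
            continue  # ignore customers with no overlap or opposite tastes
        for video, rating in theirs.items():
            if rating == "liked" and video not in mine:
                scores[video] = scores.get(video, 0.0) + weight
    # Highest score first; ties broken alphabetically for a stable printout.
    return sorted(scores, key=lambda v: (-scores[v], v))[:limit]

records = {
    "you": {"Alien": "liked", "Heat": "liked", "Big": "disliked"},
    "pat": {"Alien": "liked", "Heat": "liked", "Big": "disliked",
            "Ronin": "liked", "Babe": "disliked"},
    "sam": {"Alien": "disliked", "Big": "liked", "Babe": "liked"},
}
print(recommend("you", records))
```

Here pat's ratings match yours on every shared video, so pat's other liked title is suggested, while sam's opposite tastes are ignored. The loop compares the target with every other customer, which is fine for a few thousand records.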

But suppose that instead of a single store doing this on its own, what we have is a national or international chain of video stores, like Blockbuster, and let us suppose they collect this information for all of their customers all over the planet, a hundred million or more. I don’t know how many videos they carry, but a typical book of video reviews has a little more than 20,000 listed.

This is now a much juicier problem, with lots more data to crunch. It is no longer possible to simply compare your record to those of all the other customers, so a lot more math is needed. Most likely this will involve a kind of data compression based on eigenvectors, but you don’t need to know the details; all you really need to know is that the math attempts to simulate the performance of an uncompressed database.

If it were possible to compare your record of video rentals and ratings with those of the other 100 million customers, it would be quite easy to pick a hundred or so people (1-in-a-million) whose tastes are almost identical with yours, and then recommend to you things they liked. But we can’t do exactly that since it would require too much computer power, so we use a compression scheme, which attempts to simulate what we can no longer do because of the size of the database. The better the math, the better it can simulate it.

There is actually a nice side effect of compression schemes: most of them do a little interpolation within the data or extrapolation beyond it, and so can make predictions about videos that nobody quite like you has seen yet. More about that another time.

Well, the time has come to introduce a video-scientist. Let us suppose that some intrepid doctoral student manages to talk the chain of video stores into letting him use their database (the whole thing) in his research. This researcher has read many papers on the use of factor analysis in psychology, and decides to write his dissertation on the most important factors in choosing a video. He performs factor analysis on the data and discovers that the most important single factor is … well, whatever, and fills out his analysis by listing the other 19 most important factors, for a total of 20.

But just as he has finished his research and written his dissertation, he stumbles across another, much smaller database of compressed data that the stores use for making their recommendations, and discovers that instead of using a huge list of customer preferences they have a compressed table that describes each video in 20 numbers based on the eigenvectors of the autocorrelation matrix of the rows in the large uncompressed table.

Of course the video stores don’t know what those numbers mean, they’re really just compressed data. But compressing by eigenvectors that way is more-or-less the same as the factor analysis used by so many psychologists in their research — and in this case it turns out that the 20 numbers per video of the compressed table are exactly the same as the 20 largest factors extracted by this young scientist.
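The flavour of this compression can be shown with a deliberately tiny sketch that extracts just one factor instead of 20. The assumptions are all mine: ratings are coded +1 for liked, -1 for disliked and 0 for not rated, the video-by-video matrix is the plain product of the table with itself (no centring or scaling), and the leading eigenvector is found by simple power iteration in pure Python. A real system would use a linear algebra library and keep many eigenvectors. The names `item_correlation`, `dominant_eigenvector` and the ratings table are hypothetical.

```python
import math
import random

def item_correlation(ratings):
    """Video-by-video matrix C = R^T R for a customers x videos table R."""
    n_items = len(ratings[0])
    return [
        [sum(row[i] * row[j] for row in ratings) for j in range(n_items)]
        for i in range(n_items)
    ]

def dominant_eigenvector(matrix, iterations=200, seed=0):
    """Power iteration: multiply by the matrix and renormalise, repeatedly."""
    rng = random.Random(seed)
    v = [rng.uniform(0.5, 1.5) for _ in matrix]
    for _ in range(iterations):
        w = [sum(a * x for a, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
    eigenvalue = sum(
        x * sum(a * y for a, y in zip(row, v)) for x, row in zip(v, matrix)
    )
    return eigenvalue, v

# Rows are customers, columns are videos: two thrillers, then two romances.
ratings = [
    [1, 1, -1, -1],
    [1, 1, -1, 0],
    [-1, -1, 1, 1],
    [-1, 0, 1, 1],
]
correlation = item_correlation(ratings)
eigenvalue, factor = dominant_eigenvector(correlation)
print(eigenvalue, [round(x, 3) for x in factor])
```

Each video is now summarised by one number, and the sign of that number separates the two groups of videos even though nothing in the code knows what a thriller or a romance is. That is the sense in which the compressed table and the researcher's factors can coincide.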

So we actually do have a kind of technology that does something not unlike a kind of science, extracting the major factors from some large table of data. But of course the technology just uses the compressed data for its own purposes, it doesn’t name the factors or describe them, and it certainly doesn’t write them up for publication in Psychological Reviews.

The important point I want to get across is this: all of what I have just written about is perfectly straightforward stuff using known techniques — known technology, that is. It does not require some “theory of video preferences” to work, although in some cases part of what the technology generates may be not unlike such a theory. It’s all known technology, though, nothing mysterious, nothing controversial.

I could have written this about libraries or bookstores instead of video stores — and now that I think about it I think I did at some point a while back.

When it comes to people the same basic ideas are involved. If it was just a question of searching for compatible people it would be almost exactly the same. It is not hard to imagine a large bordello which operated exactly on this principle and was able to keep each customer quite happy by using its computer to do exactly what the video store did in my example. As long as customers return their girls through the right slots it would work perfectly. Known technology.

There is a lot more math in what I have called Social Network Optimization, but it is not a question of needing to figure out a new theory of personality. We don’t need to do that any more than we needed to figure out a new theory of video preferences; it’s just a question of collecting and using the data. The extra math only comes in the attempt to simultaneously satisfy a large number of people and keep them satisfied, and again no theory is required, no science is required; it’s just technology.


Copyright © 1998 Douglas P. Wilson

Posted in Old Pages | Leave a comment

Possible uses for Social Technology

Help find any of these aspects of your social environment:

Copyright © 2010 Douglas P. Wilson

Posted in Old Pages | Leave a comment