Post by microfarad on Jan 8, 2011 2:21:48 GMT
This post is about my Memtron. But first we'll discuss its parent, the Perceptron.
Have you ever heard of a Perceptron? Well, basically, a Perceptron is an artificial neural net made with a special kind of artificial neuron, also called a Perceptron. There are lots of different types of neural nets, and the Perceptron is one of the most primitive.
A Perceptron has any number of inputs, but only one output. To figure out its output, the Perceptron adds together the products of all of its inputs and their corresponding weights. There is also a bias value, which is added to that sum. I prefer to treat the bias as just another weight whose corresponding "input" is always the number 1, because then you handle the bias in exactly the same way as the other input weights. But that sum is NOT yet the correct output. A classic Perceptron returns True if the sum is above 0, and False if it is below. It is found to work best if True=1 and False=-1 rather than True=1 and False=0.
Let's get an example 'cause all this stuff is probably making you bored.
Here is a Perceptron which acts like an AND gate:
Weight1 | 1 |
Weight2 | 1 |
Bias | -1.5 |
So if we input 0 and 0 we get:
(1*0)+(1*0)+(-1.5) = -1.5
-1.5 < 0, so False (-1)
If we input 1 and 0 we get:
(1*1)+(1*0)+(-1.5) = -.5
-.5 < 0, so False (-1)
But if we input 1 and 1 we get:
(1*1)+(1*1)+(-1.5) = .5
.5 > 0, so True (1)
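The AND example above fits in a few lines of Python. This is just a sketch; the function and variable names are mine, but the weight and bias values are the ones from the table:

```python
def perceptron(inputs, weights):
    # The bias is treated as one more weight whose "input" is always 1.
    total = sum(w * x for w, x in zip(weights, inputs + [1]))
    return 1 if total > 0 else -1

# AND gate: Weight1 = 1, Weight2 = 1, Bias = -1.5
and_weights = [1, 1, -1.5]
print(perceptron([0, 0], and_weights))  # -1 (False)
print(perceptron([1, 0], and_weights))  # -1 (False)
print(perceptron([1, 1], and_weights))  #  1 (True)
```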
See? It acts like an AND. But if we want to hook multiple units together, we're going to have to teach it in some easier way than plugging in a bunch of weights. Remember, 1 Perceptron could have WAY more than 2 inputs. So we make it learn using the Delta rule. I am going to give this to you in a comprehensible algorithm.
1. Define the learning rate (I'll call this r). A number like .1 works well
2. Define a set of new weights. We'll replace the old ones with these at the end of this algorithm.
3. Evaluate the Perceptron's output with a set of circumstances (input states), and subtract this output from the desired output given the inputs. We'll call this number "d".
4. For each weight (including the bias) do steps 5-7
5. Define n as the new weight corresponding to the weight in question
6. Define o as the old weight in question, and i as the input corresponding to that weight (for the bias, i is just 1)
7. n=o+(i*d*r)
8. Replace the old weights with the new weights
To properly use this Delta rule, we must systematically teach every combination of inputs and their corresponding outputs (using that algorithm above). And we need to do it lots of times before the weights are correctly adjusted!
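That training loop might look like this in Python. Rough sketch, all names are mine; it uses the delta-rule update n = o + (i*d*r), where i is the input paired with each weight, and teaches every row of the AND truth table many times:

```python
def evaluate(inputs, weights):
    # Bias handled as one more weight whose "input" is always 1.
    total = sum(w * x for w, x in zip(weights, inputs + [1]))
    return 1 if total > 0 else -1

def delta_rule(weights, inputs, desired, r=0.1):
    # d = desired output minus actual output
    d = desired - evaluate(inputs, weights)
    # Each new weight: n = o + (i * d * r), i being that weight's input.
    return [o + i * d * r for o, i in zip(weights, inputs + [1])]

# Teach an AND gate: every input combination, lots of times.
truth_table = [([0, 0], -1), ([1, 0], -1), ([0, 1], -1), ([1, 1], 1)]
weights = [0.0, 0.0, 0.0]
for _ in range(100):
    for inputs, desired in truth_table:
        weights = delta_rule(weights, inputs, desired)

for inputs, desired in truth_table:
    print(inputs, evaluate(inputs, weights))  # outputs match the table
```

Since AND is linearly separable, the weights settle on a working solution after only a handful of passes.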
But then, if we need to teach a network of Perceptrons, we need to do even more. This is where it gets fuzzy for me. Perceptron networks are basically used to take a bunch of raw data as the inputs, and return an abstracted representation of that data at the end. Such a network is made of several layers of Perceptron units. The first layer takes the incoming data and sends its output to every single Perceptron in the second layer. That means that every Perceptron in the first layer has a connection to every Perceptron in the second layer. The second layer connects similarly to the third layer, and so on. Supposedly, you could represent the pixels of an image on the bottom layer, and the top layer could tell you whether it's the letter A, for example. To teach these networks, a method called backpropagation is used. Here is a simple algorithm for backpropagation, as far as I can discern. It might be wrong; I haven't tested it, unlike the Delta rule.
Basically, you present the first layer with your input data, and hand-pick the outputs at the top layer for the things you want it to identify. If you're using images, you could have 3 Perceptrons at the top layer: one should turn on if the image is my logo, one if it's a troll face, and the third if it's a ball. You could, supposedly, train it to recognize DIFFERENT balls, somewhat altered troll faces, and even alterations to my logo as what they are meant to be. So, the algorithm (might) go like this.
1. Give it the input data.
2. Calculate the result.
3. For every Perceptron in the top layer, do steps 4-5.
4. Apply the Delta rule with your desired output value, but DO NOT update the weights, just set aside the new weights to be updated later.
5. For every Perceptron that feeds into the Perceptron in question, do steps 6-7.
6. Apply the Delta rule. If the desired value of the Perceptron in question agrees with its evaluation, the desired value of this Perceptron should be the same as its own evaluation; otherwise it should be the inverse. DO NOT update the weights yet; just set aside the new weights to be updated later.
7. Go back to step 5, and perform the loop with this Perceptron. Yes, you will get nested loops.
8. Now that you have taught lots of Perceptrons, update every set of weights to the new set.
And of course you need to repeat that for lots of inputs lots of times.
But I have never tested that, so it might not be right.
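Taken literally (with the same caveat that this is untested and might be wrong), that recipe might be sketched like this in Python. Every name here is mine, and the agree/inverse rule is just my reading of step 6:

```python
def evaluate(inputs, weights):
    total = sum(w * x for w, x in zip(weights, inputs + [1]))
    return 1 if total > 0 else -1

def forward(data, network):
    # Record the inputs each layer saw; layer outputs feed the next layer.
    acts = [data]
    for layer in network:
        acts.append([evaluate(acts[-1], w) for w in layer])
    return acts

def teach(network, data, desired_top, r=0.1):
    acts = forward(data, network)
    # Set the new weights aside; they replace the old ones only at the end.
    new_net = [[list(w) for w in layer] for layer in network]

    def visit(layer, unit, desired):
        inputs = acts[layer]
        out = acts[layer + 1][unit]
        d = desired - out
        new_net[layer][unit] = [
            o + i * d * r
            for o, i in zip(network[layer][unit], inputs + [1])]
        if layer == 0:
            return
        for j in range(len(network[layer - 1])):
            # Same desired value if this unit's desired agrees with its
            # evaluation, otherwise the inverse.
            child_out = acts[layer][j]
            child_desired = child_out if desired == out else -child_out
            visit(layer - 1, j, child_desired)

    for k, desired in enumerate(desired_top):
        visit(len(network) - 1, k, desired)
    return new_net

# Two inputs -> two hidden Perceptrons -> one output Perceptron.
net = [[[0.5, -0.5, 0.1], [0.2, 0.3, -0.1]], [[1.0, 1.0, -1.5]]]
new_net = teach(net, [1, 0], desired_top=[-1])
```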
What of my Memtron? Well, the Memtron is almost exactly like the Perceptron, except a Memtron outputs True if its sum evaluates to over .5, False if it evaluates to less than -.5, and repeats its most recently outputted value if the sum falls between -.5 and .5.
This adds memory to the concept of Perceptrons. I am still testing Memtrons, but they seem to work. I need to come up with an alternative to backpropagation, though...
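Here's a minimal sketch of that rule in Python. The naming is mine, and the initial "last output" of -1 is an assumption (the post doesn't say what the starting state should be):

```python
class Memtron:
    def __init__(self, weights):
        self.weights = weights   # last entry is the bias
        self.last_output = -1    # assumed initial state

    def evaluate(self, inputs):
        total = sum(w * x for w, x in zip(self.weights, inputs + [1]))
        if total > 0.5:
            self.last_output = 1     # True
        elif total < -0.5:
            self.last_output = -1    # False
        # Between -.5 and .5 the previous output is kept.
        return self.last_output

# A unit that latches: strong input flips it, weak input leaves it alone.
m = Memtron([1, 0])       # one input, zero bias
print(m.evaluate([1]))    # 1   (total = 1 > .5)
print(m.evaluate([0]))    # 1   (total = 0, keeps last output)
print(m.evaluate([-1]))   # -1  (total = -1 < -.5)
```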