Monday, 15 July 2013

Notes from Build 2013: HTML5 vs XAML, Neural Networks and Security


I met some people who were very pro HTML5 and see it as the future. I met others who were very pro XAML and see that as the future. After the silent demise of Silverlight, it’s a bit difficult to know where the future lies.

On the HTML5 front I intend to catch up on subjects such as Bootstrap (MVC5), TypeScript, Reactive Binding (MDV ECMA), Shadow DOM, Angular, Bower, Command.js, Node/Grunt, Handlebars.js...

I heard a rumor that out of 10 projects MS started in HTML5, 7 have been rewritten in XAML. The pragmatic approach of hybrid solutions is the way to go: use HTML5 when it makes sense, but be aware that there is a cost associated with its use. WPF is a more elegant solution because it uses OO and properly separates concerns. When you need customer reach, HTML5 is the way to go, but make sure that the target customers have browsers capable of handling HTML5.

A friend of mine will be working together with Xamarin to provide a VS template with MVVM Light for native iOS apps with portable C# libraries.
 
 

Here’s a session that I can highly recommend:
 
4-554 Building Big: Lessons Learned from Windows Azure Customers


It’s not likely that I will be writing software that needs to scale in quite the same way as described in this session, but the real-life examples were very interesting and definitely worth watching.

Another session that is definitely worth watching is:

AZ-18 Securing Windows Store Applications and REST services with Active Directory



The talk was arranged around a story and was very good:

  1. The story started with an isolated corporate network whose users, resources and access control could be administered easily.
  2. Then along comes an external resource that needs to be accessed by domain users, and the administrator looks a little less happy.
  3. Next, external users need to access domain resources, which really upsets the administrator.
  4. Finally, BYOD devices need to access domain resources (Vittorio then drew the picture of The Scream).

REST OAuth2

  1. A user authenticates at an authorization endpoint
  2. The user receives a code
  3. The user sends this code to a token endpoint and receives an access token
  4. This token can then be used to access external resources
  5. There is a refresh token that allows a new access token to be obtained once the cached one, which is only valid for a limited period of time, expires
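
The flow above corresponds to the OAuth2 authorization code grant. As a rough illustration, here is a Python sketch that only builds the requests involved, without sending anything over the network. The endpoint URLs, client id, redirect URI and scope are hypothetical placeholders and do not come from the talk.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical values for illustration only; none of these come from the talk.
AUTHORIZE_ENDPOINT = "https://login.example.com/oauth2/authorize"
TOKEN_ENDPOINT = "https://login.example.com/oauth2/token"
CLIENT_ID = "my-client-id"
REDIRECT_URI = "https://app.example.com/callback"


def build_authorization_url(scope):
    """Step 1: URL the user is sent to in order to sign in and grant access."""
    query = urlencode({
        "response_type": "code",  # ask the endpoint for an authorization code
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": scope,
    })
    return AUTHORIZE_ENDPOINT + "?" + query


def extract_code(redirect_url):
    """Step 2: read the authorization code from the redirect back to the app."""
    return parse_qs(urlparse(redirect_url).query)["code"][0]


def build_token_request(code):
    """Step 3: form fields POSTed to TOKEN_ENDPOINT to redeem the code."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
    }


def bearer_header(access_token):
    """Step 4: header that accompanies each call to the protected resource."""
    return {"Authorization": "Bearer " + access_token}


def build_refresh_request(refresh_token):
    """Step 5: form fields POSTed to TOKEN_ENDPOINT to get a new access token."""
    return {
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": CLIENT_ID,
    }
```

A real client would send these with an HTTP library and validate the responses; a library such as AAL hides most of this.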

Windows Azure Active Directory

  • This can be standalone or a synchronized part of an on-premises AD directory
  • Supports OAuth2, SAML-P, WS-Federation and has metadata endpoints
  • There is a OneClient preview in the Azure portal that is used to maintain the Azure AD
  • Windows Azure Authentication Library (AAL)
    See the presentation for links on how to use this
     

Essential AAL Usage
  • Authenticate the user to get a token:
    AuthenticationContext authContext = new AuthenticationContext(…);
    AuthenticationResult result = await authContext.AcquireTokenAsync(…);
  • Use the token to invoke a REST service:
    HttpClient httpClient = new HttpClient();
    httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", result.AccessToken);
      
Although I don’t have a direct application for neural networks, this talk was really well presented:

2-401 Developing Neural Networks Using Visual Studio

 
Agenda
 
  1. What types of problems does a neural network solve
  2. What exactly is a neural network
  3. How does a neural network actually work
  4. Understanding activation functions
  5. Alternatives to neural networks
  6. Understanding neural network training
  7. Neural network over-fitting
  8. Developing with Visual Studio
  9. Summary and resources

What types of problems does a neural network solve

Tabular information where you have some inputs (independent variables) that produce an output (the thing you want to predict). The idea is that training data is used to fit the internal variables of the neural network, after which you have a system that can predict an output from a given set of inputs.

What exactly is a neural network

The inputs are normalized: Boolean variables are converted to -1 and +1, and enumerations to a set of individual inputs that are set to 0 or 1. These are then used in the input nodes of the neural network. A to-be-determined number of hidden nodes then evaluates a function based on all these inputs, and the results feed a set of output nodes.
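
To make the encoding and the feed-forward step a little more concrete, here is a small Python sketch. It is only an illustration under my own assumptions: numeric inputs are normalized with a mean and standard deviation, the hidden nodes use tanh, and the output nodes are a plain weighted sum. All function names are made up for this example.

```python
import math


def encode_boolean(value):
    """Booleans become -1 or +1."""
    return 1.0 if value else -1.0


def encode_enum(value, categories):
    """Enumerations become one input per category, set to 0 or 1."""
    return [1.0 if value == category else 0.0 for category in categories]


def normalize(value, mean, std_dev):
    """Assumed normalization for numeric inputs: distance from the mean."""
    return (value - mean) / std_dev


def feed_forward(inputs, hidden_weights, hidden_biases,
                 output_weights, output_biases):
    """Compute the output nodes from the input nodes via one hidden layer."""
    # Each hidden node applies tanh to a weighted sum of all inputs.
    hidden = [math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)
              for weights, bias in zip(hidden_weights, hidden_biases)]
    # Each output node is a weighted sum of all hidden nodes.
    return [sum(w * h for w, h in zip(weights, hidden)) + bias
            for weights, bias in zip(output_weights, output_biases)]


# Example row: age 35 (mean 40, std dev 10), a true flag, and the colour "red".
inputs = ([normalize(35.0, 40.0, 10.0), encode_boolean(True)]
          + encode_enum("red", ["red", "green", "blue"]))
# inputs is now [-0.5, 1.0, 1.0, 0.0, 0.0]
```

Training is then the search for the weights and biases that make `feed_forward` reproduce the known outputs of the training data.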

Activation Functions

  • Logistic sigmoid: output between [0,1], $y = 1.0 / (1.0 + e^{-x})$
  • Hyperbolic tangent: output value between [-1, +1], $y = \tanh(x) = (e^{x} - e^{-x}) / (e^{x} + e^{-x})$
  • Heaviside step: output value 0 or 1, if (x < 0) then y = 0 else if (x >= 0) then y = 1
  • Softmax: outputs between [0,1] that sum to 1.0, $y_i = e^{x_i} / \sum_j e^{x_j}$
The ability to customize these functions means that it is often better to write your own neural network.
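
For reference, the four activation functions can be written in a few lines of Python. This is a sketch using the standard definitions; the function names are mine, and the guards against overflow are an implementation detail I added rather than something from the talk.

```python
import math


def logistic_sigmoid(x):
    """Output between 0 and 1."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # Equivalent form that avoids overflow for large negative x.
    e = math.exp(x)
    return e / (1.0 + e)


def hyperbolic_tangent(x):
    """Output between -1 and +1."""
    return math.tanh(x)


def heaviside_step(x):
    """Output is exactly 0 or 1."""
    return 0.0 if x < 0 else 1.0


def softmax(xs):
    """Outputs are between 0 and 1 and sum to 1.0."""
    largest = max(xs)  # subtracting the maximum avoids overflow, same result
    exps = [math.exp(x - largest) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]
```

Softmax works on the whole set of output nodes at once, which is why it takes a list rather than a single value.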
 

Alternatives to neural networks
  1. Linear regression: $y = a x_1 + b x_2 + \dots$
  2. Logistic regression: $y = 1.0 / (1.0 + e^{-(a x_1 + b x_2 + \dots + k)})$
  3. Naive Bayes: assumes input data are all independent and output is binary
  4. Decision trees
  5. Support vector machines: extremely complex implementation, assumes binary output
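
The first two alternatives are simple enough to show as code. A minimal Python sketch, assuming the coefficients have already been fitted; the function and parameter names are my own.

```python
import math


def linear_regression(xs, coefficients, constant=0.0):
    """y = a*x1 + b*x2 + ... (plus an optional constant)."""
    return sum(a * x for a, x in zip(coefficients, xs)) + constant


def logistic_regression(xs, coefficients, constant=0.0):
    """Squash the same weighted sum into the range 0..1."""
    z = linear_regression(xs, coefficients, constant)
    return 1.0 / (1.0 + math.exp(-z))
```

Seen this way, logistic regression behaves like a single node with a logistic sigmoid activation and no hidden layer.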

Neural networks pros and cons

  • Pro: can model any underlying math equation!
  • Pro: can handle multinomial output without resorting to tricks.
  • Con: moderate complexity, requires lots of training data.
  • Con: must pick the number of hidden nodes, activation functions, input/output encoding, error definition.
  • Con: must pick training method, training “free parameters” (and over-fitting defense strategy).
     

Training

Back-propagation
  • Fastest technique.
  • Does not work with Heaviside activation.
  • Requires “learning rate” and “momentum.”
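
To show where the learning rate and momentum come in, here is a small Python sketch of back-propagation. It is my own minimal version, not the code from the talk: it assumes one hidden layer with tanh, a single logistic sigmoid output, squared error, and an update after every training item. The default values are arbitrary.

```python
import math
import random


def train_back_propagation(data, n_hidden=2, learning_rate=0.5, momentum=0.5,
                           epochs=2000, seed=0):
    """Train a tiny network on (inputs, target) pairs; return a predict function."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # The last weight of every node is its bias (it multiplies a constant 1.0).
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    prev_hidden = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
    prev_output = [0.0] * (n_hidden + 1)

    def forward(x):
        xb = list(x) + [1.0]
        hb = [math.tanh(sum(w * v for w, v in zip(ws, xb))) for ws in w_hidden]
        hb.append(1.0)
        z = sum(w * v for w, v in zip(w_output, hb))
        return xb, hb, 1.0 / (1.0 + math.exp(-z))

    for _ in range(epochs):
        for x, target in data:
            xb, hb, y = forward(x)
            # Error signal at the output; y * (1 - y) is the sigmoid derivative.
            # A Heaviside step has no usable derivative, so it cannot be used here.
            delta_out = (y - target) * y * (1.0 - y)
            # Propagate the signal back; 1 - h^2 is the tanh derivative.
            delta_hidden = [delta_out * w_output[j] * (1.0 - hb[j] ** 2)
                            for j in range(n_hidden)]
            for j in range(n_hidden + 1):
                # Momentum re-applies a fraction of the previous step.
                step = -learning_rate * delta_out * hb[j] + momentum * prev_output[j]
                w_output[j] += step
                prev_output[j] = step
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    step = (-learning_rate * delta_hidden[j] * xb[i]
                            + momentum * prev_hidden[j][i])
                    w_hidden[j][i] += step
                    prev_hidden[j][i] = step

    def predict(x):
        return forward(x)[2]

    return predict
```

For example, `train_back_propagation([([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)])` returns a function that approximates a logical OR.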
 
Genetic algorithm
  • Slowest technique.
  • Generally most effective.
  • Requires “population size,” “mutation rate,” “max generations,” “selection probability.”
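
A rough Python sketch of how these four parameters might fit together follows. It is my own simplified version, and in particular I am guessing at the meaning of “selection probability”: here it is the chance that the fitter of two randomly picked candidates becomes a parent. The `error` argument is any function that scores a list of weights (lower is better).

```python
import random


def genetic_minimize(error, n_weights, population_size=30, mutation_rate=0.1,
                     max_generations=200, selection_probability=0.8, seed=0):
    """Evolve a list of weights that gives a low value of error(weights)."""
    rng = random.Random(seed)
    population = [[rng.uniform(-5.0, 5.0) for _ in range(n_weights)]
                  for _ in range(population_size)]

    def select():
        # Tournament of two: usually, but not always, the fitter one wins.
        a, b = rng.sample(population, 2)
        fitter, weaker = (a, b) if error(a) <= error(b) else (b, a)
        return fitter if rng.random() < selection_probability else weaker

    best = min(population, key=error)
    for _ in range(max_generations):
        children = [best[:]]  # keep the best individual unchanged
        while len(children) < population_size:
            mother, father = select(), select()
            # Crossover: each weight comes from one of the two parents.
            child = [m if rng.random() < 0.5 else f
                     for m, f in zip(mother, father)]
            # Mutation: occasionally nudge a weight by a random amount.
            child = [g + rng.gauss(0.0, 0.5) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
        best = min(population, key=error)
    return best
```

For a neural network, `error` would run the training data through the network with the candidate weights and return the total error.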
 
Particle swarm optimization
  • Good compromise.
  • Requires “number particles,” “max iterations,” “cognitive weight,” “social weight.”
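
Particle swarm optimization is compact enough to sketch in Python as well. This is a generic textbook-style version rather than the code from the talk. The inertia weight is an extra parameter I added, and the default values are commonly used ones, not values quoted in the session.

```python
import random


def particle_swarm_minimize(error, n_weights, number_particles=20,
                            max_iterations=200, inertia_weight=0.729,
                            cognitive_weight=1.49445, social_weight=1.49445,
                            seed=0):
    """Move a swarm of candidate weight lists towards a low error(weights)."""
    rng = random.Random(seed)
    positions = [[rng.uniform(-5.0, 5.0) for _ in range(n_weights)]
                 for _ in range(number_particles)]
    velocities = [[0.0] * n_weights for _ in range(number_particles)]
    personal_best = [p[:] for p in positions]
    global_best = min(personal_best, key=error)[:]

    for _ in range(max_iterations):
        for i in range(number_particles):
            for d in range(n_weights):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (
                    inertia_weight * velocities[i][d]
                    # Cognitive part: pull towards this particle's own best.
                    + cognitive_weight * r1 * (personal_best[i][d] - positions[i][d])
                    # Social part: pull towards the best found by any particle.
                    + social_weight * r2 * (global_best[d] - positions[i][d]))
                positions[i][d] += velocities[i][d]
            if error(positions[i]) < error(personal_best[i]):
                personal_best[i] = positions[i][:]
                if error(personal_best[i]) < error(global_best):
                    global_best = personal_best[i][:]
    return global_best
```

As with the genetic algorithm, no derivative of the activation function is needed, only a way to score a candidate set of weights.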
 
Avoiding Over-fitting
What is it?
  • Symptom: Model is great on predicting existing data, but fails miserably on new data.
  • Roulette example: red, red, black, red, red, black, red, red, black, red, red, ??
  • A serious problem for all classification/prediction techniques, not just neural networks.
 
Five most common techniques
  1. Use lots of training data.
  2. Train-Validate-Test (early stop when error on validate set begins to increase).
  3. K-fold cross validation.
  4. Repeated sub-sampling validation.
  5. Jittering: deliberately adding noise to the training data to make over-fitting harder.
Quite a few exotic techniques are also available (weight penalties, Bayesian learning, etc.).
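
Two of these techniques are easy to illustrate in Python. The sketch below is my own: `k_fold_splits` shows how the data items are divided for k-fold cross validation, and `early_stop_epoch` shows the stopping rule for Train-Validate-Test. Both names are invented for this example.

```python
def k_fold_splits(n_items, k):
    """Yield (train_indices, validate_indices) once for each of the k folds."""
    indices = list(range(n_items))  # shuffle the data itself before calling
    for fold in range(k):
        validate = indices[fold::k]  # every k-th item, starting at this fold
        train = [i for i in indices if i % k != fold]
        yield train, validate


def early_stop_epoch(validation_errors):
    """Return the last epoch before the validation error begins to increase."""
    for epoch in range(1, len(validation_errors)):
        if validation_errors[epoch] > validation_errors[epoch - 1]:
            return epoch - 1
    return len(validation_errors) - 1
```

With 10 items and 5 folds, every item is used for validation exactly once and for training four times.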
 

Summary

Existing neural network tools are difficult or impossible to integrate into a software system.
Commercial and Open Source API libraries work well for some machine learning tasks but are extremely limited for neural networks.
To develop neural networks using Visual Studio you must understand seven core concepts: feed-forward, activation, data encoding, error, training, free parameters, and over-fitting.
Once the concepts are mastered, implementation with Visual Studio is not difficult (but not easy either).