Tuesday, 12 April 2016

How to convert []byte (byte array) to json in golang

Converting a json byte array to a json/map object in golang.

// Below is just a declaration of a map variable; no map object is instantiated yet.
var dat map[string]string
// Now unmarshal the byte array. json.Unmarshal allocates the map and assigns it
// through the pointer passed as the second argument.
if err := json.Unmarshal(body, &dat); err != nil {
   panic(err)
}
fmt.Println(dat)
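
For completeness, here is a minimal runnable sketch of the same idea. The helper name decodeStringMap and the sample payload are assumptions made for illustration; only json.Unmarshal comes from the snippet above. It returns the error instead of panicking, which is usually easier to handle in real code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeStringMap is a hypothetical helper: it unmarshals a JSON object
// whose values are all strings into a map[string]string.
func decodeStringMap(body []byte) (map[string]string, error) {
	var dat map[string]string // nil until Unmarshal allocates it
	if err := json.Unmarshal(body, &dat); err != nil {
		return nil, err
	}
	return dat, nil
}

func main() {
	// Sample payload, assumed for illustration only.
	body := []byte(`{"name":"gopher","lang":"go"}`)

	dat, err := decodeStringMap(body)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(dat["name"], dat["lang"])
}
```

Note that map[string]string only works when every value in the json object is a string; for mixed value types, map[string]interface{} is the usual choice.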

How to convert []byte to string in golang?

1) Print the byte array as a formatted string

fmt.Printf("%s\n",body)


2) Convert the byte array to a string

buff := bytes.NewBuffer(body)
fmt.Println(buff.String())
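
A third option, probably the most common one, is a plain string(...) conversion. The sketch below wraps it in a hypothetical helper named bytesToString so the behaviour is easy to see; the sample bytes are made up for illustration.

```go
package main

import "fmt"

// bytesToString converts a byte slice to a string with a plain conversion.
// The conversion copies the bytes, so later changes to b do not affect the result.
func bytesToString(b []byte) string {
	return string(b)
}

func main() {
	body := []byte("hello") // stand-in for the body variable used above
	s := bytesToString(body)
	body[0] = 'J' // modifying the slice afterwards leaves s untouched
	fmt.Println(s)
}
```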




Monday, 11 April 2016

go / golang difference between := vs =

:= is for declaration + assignment, and = is for assignment only.
From the golang documentation:

Inside a function, the := short assignment statement can be used in place of a var declaration with implicit type.

Outside a function, every construct begins with a keyword (var, func, and so on) and the := construct is not available.

In the example below, both the := and = operators are used.

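func Reverse(s string) string {
    // := declares r and assigns the string's runes to it.
    r := []rune(s)
    // i and j are declared with := in the loop init and updated with = afterwards.
    for i, j := 0, len(r)-1; i < len(r)/2; i, j = i+1, j-1 {
        // = assigns to existing elements: swap the two ends.
        r[i], r[j] = r[j], r[i]
    }
    return string(r)
}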
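
Here is a small additional sketch of the same rules. The names counter, bump and shadow are hypothetical and exist only for this illustration. It also shows one pitfall: using := inside an inner block declares a new variable instead of updating the outer one.

```go
package main

import "fmt"

// Package-level declarations must start with a keyword such as var;
// := is not available outside a function.
var counter = 10

// bump shows declaration (:=) followed by plain assignment (=).
func bump(n int) int {
	total := n        // := declares total and assigns n to it
	total = total + 1 // = assigns to the already declared variable
	return total
}

// shadow shows that := in an inner block creates a new variable.
func shadow() int {
	x := 1
	if true {
		x := 2 // a new x, scoped to this block only
		_ = x
	}
	return x // the outer x is unchanged
}

func main() {
	counter = bump(counter)
	fmt.Println(counter)
	fmt.Println(shadow())
}
```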


What is a rune in go

Unlike some other programming languages, golang doesn't have a dedicated character type.

So how are characters represented in a string?

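ASCII characters in a string each take a single byte.
Unicode characters (code points) are represented by the rune type; inside a string they are stored as UTF-8, which can take up to four bytes per character.

In reality byte and rune are both just aliases for integer types (uint8 and int32).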

If you want to force them to be printed as characters instead of numbers, you need to use fmt.Printf("%c", x). The %c format verb works for any integer type.

In the program below, the string is converted to []rune so that indexing returns whole characters rather than individual bytes.

package main

import "fmt"

func main() {
    fmt.Println(string([]rune("Hello, 日本語")[1])) // prints "e"
    fmt.Println(string([]rune("Hello, 日本語")[8])) // prints "本"
}
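
To make the byte vs rune difference more visible, here is a short sketch that counts both for the same string and then walks over it with range. The helper name counts is an assumption for illustration; utf8.RuneCountInString is from the standard library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// counts returns the number of bytes and the number of runes in s.
func counts(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "Hello, 日本語"

	b, r := counts(s)
	fmt.Println(b, r) // 16 bytes, 10 runes: each of the three kanji takes 3 bytes

	// range over a string decodes UTF-8 and yields one rune per iteration,
	// together with the byte offset at which that rune starts.
	for i, c := range s {
		fmt.Printf("%d:%c ", i, c)
	}
	fmt.Println()
}
```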

go initial errors "cannot refer to unexported name fmt.printf"


I'm getting started with golang, just recording my initial hiccups and the solutions.

cannot refer to unexported name fmt.printf
hello/hello.go:6: undefined: fmt.printf

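Golang is case sensitive, hence the function is fmt.Printf, with a capital P. The lowercase name printf is not exported from the fmt package, so it cannot be used from outside it.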

In Go, a name is exported if it begins with a capital letter. For example, Pizza is an exported name, as is Pi, which is exported from the math package.
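
As a quick way to check the rule, the sketch below uses ast.IsExported from the standard library's go/ast package, which reports whether a name begins with an upper-case letter. The wrapper isExported and the list of sample names are assumptions for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
)

// isExported wraps ast.IsExported, which reports whether a Go name
// starts with an upper-case letter and is therefore exported.
func isExported(name string) bool {
	return ast.IsExported(name)
}

func main() {
	for _, name := range []string{"Printf", "printf", "Pizza", "Pi"} {
		fmt.Println(name, isExported(name))
	}
}
```

Only printf is reported as unexported here, which matches the compiler error above.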

Friday, 8 April 2016

What are Elasticsearch Analyzers?

Analyzers are pre-processors which are executed on the text before the inverted index is generated.

So why do we need Analyzers in Elasticsearch?

Consider a field for which we want to build an inverted index:

Document 1: "Hello Elasticsearch World "
Document 2: "hello Elasticsearch world"

The Inverted Index looks like this:

Term(Document,Frequency)
Hello(1,1)
World(1,1)
Elasticsearch(1,1),(2,1)
hello(2,1)
world(2,1)

It's evident that "Hello" and "hello" are the same word with a change in case. As there are two separate index entries for the same word, a query returns only partial results when we need a case-insensitive search.

To fix the above issue, let's convert the words to lowercase before creating the inverted index.

Term(Document,Frequency)
hello(1,1),(2,1)
world(1,1),(2,1)
elasticsearch(1,1),(2,1)

This is what analyzers in Elasticsearch are for. This is a simple illustration; analyzers are designed to do more than the example shown here.

In simple terms, an Analyzer does the following:
  1. Splits the text into individual terms or tokens, for example on whitespace.
  2. Standardizes the individual terms (for example by lowercasing them) so they are searchable.
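
The two steps can be sketched in a few lines of Go. This is only a toy stand-in for illustration, not how Elasticsearch implements its analyzers; the function name analyze is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// analyze is a simplified stand-in for an analyzer:
// step 1 splits the text on whitespace, step 2 lowercases each token.
func analyze(text string) []string {
	tokens := strings.Fields(text) // also drops leading and trailing whitespace
	for i, t := range tokens {
		tokens[i] = strings.ToLower(t)
	}
	return tokens
}

func main() {
	fmt.Println(analyze("Hello Elasticsearch World "))
	fmt.Println(analyze("hello Elasticsearch world"))
}
```

Both documents from the example produce the same tokens, which is why they end up under the same terms in the index.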



What is an Inverted Index in Elasticsearch

Elasticsearch is a full-text search engine built on top of Lucene. The index structure used in Lucene to enable fast search lookups is called an "Inverted Index".

The inverted index is a simple concept, yet powerful enough to enable efficient search. It's a list of the unique words (terms) that appear in the stored documents. Each unique word maps to the list of documents in which it appears, along with how many times the word occurs in each document (Frequency).
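
To make the structure concrete, here is a toy sketch in Go that builds such a term to (document, frequency) mapping for the two sample documents from the analyzer post. It is an illustration under simplifying assumptions (whitespace splitting, lowercasing, documents numbered from 1), not Lucene's actual data structure; the name buildIndex is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// buildIndex returns a toy inverted index: term -> document ID -> frequency.
// Documents are numbered from 1 in the order they are passed in.
func buildIndex(docs []string) map[string]map[int]int {
	index := make(map[string]map[int]int)
	for i, doc := range docs {
		docID := i + 1
		// Lowercase and split on whitespace before indexing.
		for _, term := range strings.Fields(strings.ToLower(doc)) {
			if index[term] == nil {
				index[term] = make(map[int]int)
			}
			index[term][docID]++
		}
	}
	return index
}

func main() {
	index := buildIndex([]string{
		"Hello Elasticsearch World ",
		"hello Elasticsearch world",
	})
	// Look up a term: the result lists each document and the term's frequency in it.
	fmt.Println(index["hello"])
}
```

A search for a term then becomes a single map lookup instead of a scan over every document.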