
SolrCloud Installation

Practical notes on setting up SolrCloud + ZooKeeper + HDFS

 

 

Setting up ZooKeeper

1. Download ZooKeeper from the official mirror (it is used to manage the SolrCloud configuration files): http://mirror.bit.edu.cn/apache/zookeeper/zookeeper-3.4.6/

2. Prepare three servers, or set up three virtual machines, for example:

host3.com   192.168.2.87

host5.com   192.168.2.89

host4.com   192.168.2.94

3. Upload zookeeper-3.4.6.tar.gz to the /usr/local/ directory on any one of the servers and extract it in place: tar -zxvf zookeeper-3.4.6.tar.gz. Then rename zookeeper-3.4.6 to zookeeper: mv zookeeper-3.4.6 zookeeper

4. Create the data and logs directories under the zookeeper directory, and copy conf/zoo_sample.cfg to conf/zoo.cfg

5. Edit zoo.cfg:

# The number of milliseconds of each tick

tickTime=2000

# The number of ticks that the initial

# synchronization phase can take

initLimit=10

# The number of ticks that can pass between

# sending a request and getting an acknowledgement

syncLimit=5

# the directory where the snapshot is stored.

# do not use /tmp for storage, /tmp here is just

# example sakes.

dataDir=/usr/local/zookeeper/data

# the port at which the clients will connect

clientPort=2181

# the maximum number of client connections.

# increase this if you need to handle more clients

#maxClientCnxns=60

#

# Be sure to read the maintenance section of the

# administrator guide before turning on autopurge.

#

# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance

#

# The number of snapshots to retain in dataDir

#autopurge.snapRetainCount=3

# Purge task interval in hours

# Set to "0" to disable autopurge feature

#autopurge.purgeInterval=1

dataLogDir=/usr/local/zookeeper/logs

server.1=192.168.2.89:2888:3888

server.2=192.168.2.87:2888:3888

server.3=192.168.2.94:2888:3888

 

6. Copy the zookeeper directory to the other two servers:

scp -r /usr/local/zookeeper root@192.168.2.87:/usr/local/

scp -r /usr/local/zookeeper root@192.168.2.89:/usr/local/

 

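On each server, create a file named myid in the data directory. Its content must match the server.N entry for that server's IP in zoo.cfg: 1 on the server.1 host, 2 on the server.2 host, and 3 on the server.3 host.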
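The mapping between myid and the server.N entries can also be derived from zoo.cfg instead of typed by hand. The following is a minimal sketch, assuming the zoo.cfg layout shown above; the function name write_myid and its three arguments are hypothetical and not part of ZooKeeper, which only requires that data/myid contains the number N.

```shell
# write_myid CFG IP DATA_DIR
# Finds the "server.N=IP:port:port" line for IP in CFG and writes N to
# DATA_DIR/myid. Hypothetical helper for illustration only.
write_myid() (
    cfg="$1"; ip="$2"; data_dir="$3"
    # Split each line on "=" and ":", so $1 is "server.N" and $2 is the address.
    id="$(awk -F'[=:]' -v ip="$ip" \
        '/^server\.[0-9]+=/ && $2 == ip { sub(/^server\./, "", $1); print $1; exit }' \
        "$cfg")"
    if [ -z "$id" ]; then
        echo "no server.N entry for $ip in $cfg" >&2
        exit 1
    fi
    mkdir -p "$data_dir"
    echo "$id" > "$data_dir/myid"
)

# Example, run on the host whose address is 192.168.2.89:
# write_myid /usr/local/zookeeper/conf/zoo.cfg 192.168.2.89 /usr/local/zookeeper/data
```

The function fails with a non-zero status when the address has no server.N line, which catches a mistyped IP before the node is started.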

7. Start the ZooKeeper cluster by starting the ZooKeeper service on each node (bin/zkServer.sh start, run from the zookeeper directory).
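Starting the nodes one by one can be wrapped in a small loop. This is only a sketch: it assumes root SSH access to each node and the /usr/local/zookeeper install path used above, and the function name start_zk_cluster and its RUNNER argument are hypothetical conveniences.

```shell
# start_zk_cluster RUNNER HOST...
# Runs "zkServer.sh start" on every HOST through RUNNER (normally "ssh").
# Passing "echo" as RUNNER prints the commands instead of executing them.
start_zk_cluster() (
    runner="$1"; shift
    for host in "$@"; do
        "$runner" "root@$host" "/usr/local/zookeeper/bin/zkServer.sh start" || exit 1
    done
)

# Example:
# start_zk_cluster ssh 192.168.2.87 192.168.2.89 192.168.2.94
```

Running it once with echo as the runner is a cheap way to review the commands before they touch the servers.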

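8. Check the state of the ZooKeeper cluster to make sure it started without problems. On each server, run bin/zkServer.sh status from the zookeeper directory to see which nodes are followers and which one is the leader.

For example: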

[root@host4 zookeeper-3.3.6]# bin/zkServer.sh status

JMX enabled by default

Using config: /home/hadoop/zookeeper-3.3.6/bin/../conf/zoo.cfg

Mode: follower

[root@host5 /]# cd /home/hadoop/zookeeper-3.3.6/

[root@host5 zookeeper-3.3.6]# bin/zkServer.sh status

JMX enabled by default

Using config: /home/hadoop/zookeeper-3.3.6/bin/../conf/zoo.cfg

Mode: leader

[root@host3 multicore]# cd /home/hadoop/zookeeper-3.3.6/

[root@host3 zookeeper-3.3.6]# bin/zkServer.sh status

JMX enabled by default

Using config: /home/hadoop/zookeeper-3.3.6/bin/../conf/zoo.cfg

Mode: follower
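The check "one leader, the rest followers" can be automated by counting the Mode lines. The sketch below assumes the status output format shown above; check_zk_modes is a hypothetical helper name, and it only inspects text passed on standard input.

```shell
# check_zk_modes
# Reads concatenated "zkServer.sh status" output on stdin, prints the counts,
# and succeeds only if exactly one leader and at least one follower are seen.
check_zk_modes() (
    # Keep only the value after "Mode:", with or without a following space.
    modes="$(sed -n 's/^Mode:[[:space:]]*//p')"
    leaders="$(printf '%s\n' "$modes" | grep -c '^leader$' || true)"
    followers="$(printf '%s\n' "$modes" | grep -c '^follower$' || true)"
    echo "leaders=$leaders followers=$followers"
    [ "$leaders" -eq 1 ] && [ "$followers" -ge 1 ]
)

# Example (assumes SSH access and the install path from the steps above):
# for h in 192.168.2.87 192.168.2.89 192.168.2.94; do
#     ssh "root@$h" /usr/local/zookeeper/bin/zkServer.sh status
# done | check_zk_modes
```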

 

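Setting up SolrCloud

1. Download the Solr package solr-4.10.1.tgz from the Apache mirror: http://mirror.bit.edu.cn/apache/lucene/solr/4.10.1/ Extract it with tar -xvzf solr-4.10.1.tgz, then rename the solr-4.10.1 directory: mv solr-4.10.1 solr

2. Create the working directory under the root: mkdir -p /data0/solrcloud. Move the solr directory into /data0, so that /data0 contains two directories: {solr, solrcloud}

3. Copy /data0/solr/example/webapps/solr.war into Tomcat's webapps directory and start Tomcat; a solr directory then appears under webapps

4. Copy all the jar files under /data0/solr/example/lib/ext/ into tomcat/webapps/solr/WEB-INF/lib/. Create the directories with mkdir -p /data0/solrcloud/{multicore,solr-lib}, then copy tomcat/webapps/solr/WEB-INF/lib/* into solr-lib/: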

cp  /usr/local/tomcat/webapps/solr/WEB-INF/lib/*  /data0/solrcloud/solr-lib/

 

 

 

5. Next, create a directory to hold the configuration files:

mkdir -p /data0/solrcloud/multicore/collection/{conf,data}

 

Then copy /data0/solr/example/solr/collection1/conf/* into the /data0/solrcloud/multicore/collection/conf directory.

Copy solr.xml and zoo.cfg from example/solr/multicore into the /data0/solrcloud/multicore directory.

 

 

The collection directory:

data-config.xml is the configuration file for data import; see:

 

http://blog.csdn.net/john_hongming/article/details/40181451

 

You need to create the solrcore.properties file yourself.

File contents:

solr.shard.data.dir=/data0/solrcloud/multicore/collection/data

Note: the property solr.shard.data.dir is referenced in solrconfig.xml and specifies where the index data is stored.

 

Notes on the solr.xml file:

 

 

 

6. Manage the configuration files through ZooKeeper:

# Upload the configuration files to ZooKeeper #

java -classpath .:/data0/solrcloud/solr-lib/* org.apache.solr.cloud.ZkCLI -cmd upconfig -zkhost host3.com:2181,host4.com:2181,host5.com:2181 -confdir /data0/solrcloud/multicore/collection/conf -confname myconf

# Link the uploaded configuration to the collection in ZooKeeper #

java -classpath .:/data0/solrcloud/solr-lib/* org.apache.solr.cloud.ZkCLI -cmd linkconfig -collection collection1 -confname myconf -zkhost host3.com:2181,host4.com:2181,host5.com:2181
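Because the same ZooKeeper connection string appears in every ZkCLI call and again in the Tomcat startup options, it can help to build it once. The sketch below is one way to do that; zk_host_string is a hypothetical helper name, and the host names and port are the ones used in this article.

```shell
# zk_host_string PORT HOST...
# Prints "host:port,host:port,..." for use with -zkhost and -DzkHost.
zk_host_string() (
    port="$1"; shift
    out=""
    for host in "$@"; do
        # Prepend a comma only when out already holds an entry.
        out="${out:+$out,}$host:$port"
    done
    echo "$out"
)

ZKHOST="$(zk_host_string 2181 host3.com host4.com host5.com)"

# Example upload using the variable. The classpath is quoted so that the
# shell passes the "*" through to Java instead of expanding it itself:
# java -classpath ".:/data0/solrcloud/solr-lib/*" org.apache.solr.cloud.ZkCLI \
#     -cmd upconfig -zkhost "$ZKHOST" \
#     -confdir /data0/solrcloud/multicore/collection/conf -confname myconf
```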

 

7. Add the following settings to the startup script tomcat/bin/catalina.sh:

# Startup parameters in tomcat/bin/catalina.sh #

JAVA_OPTS="-server -Xmx2048m -Xms1024m -verbose:gc -Xloggc:solr_gc.log -Dsolr.solr.home=/data0/solrcloud/multicore -DzkHost=host3.com:2181,host4.com:2181,host5.com:2181"

 

8. Edit tomcat/webapps/solr/WEB-INF/web.xml

context.xml:

<WatchedResource>WEB-INF/web.xml</WatchedResource>

web.xml:

<env-entry>
   <env-entry-name>solr/home</env-entry-name>
   <env-entry-value>/data0/solrcloud/multicore</env-entry-value>
   <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

 

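9. Configuring the IK analyzer for SolrCloud:

First create a lib directory under multicore/collection/ in solrcloud and move the IK analyzer's configuration files into it; the most important ones are IKAnalyzer.cfg.xml and stopword.dic.

Then edit the schema.xml file under multicore/collection/conf.

Add: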

<fieldType name="ikanalyzer" class="solr.TextField">
   <analyzer type="index" isMaxWordLength="false" class="org.wltea.analyzer.lucene.IKAnalyzer"/>
   <analyzer type="query" isMaxWordLength="true" class="org.wltea.analyzer.lucene.IKAnalyzer"/>
   <analyzer type="multiterm">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
   </analyzer>
</fieldType>

A field is tokenized according to its type attribute.

At this point the IKAnalyzer Chinese tokenizer is basically in place. Update the Solr configuration stored in ZooKeeper:

java -classpath .:/usr/local/solrcloud/solr-lib/* org.apache.solr.cloud.ZkCLI -cmd upconfig -zkhost 192.168.3.119:2181,192.168.3.111:2181,192.168.3.127:2181 -confdir /usr/local/solrcloud/multicore/collection/conf -confname myconf

 

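To add an extension dictionary:

Create a classes directory under tomcat/webapps/solr/WEB-INF/ and put the dictionaries to be added, together with the configuration file, in that directory.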

 

 

 

Edit IKAnalyzer.cfg.xml to add the dictionary,

as shown in the figure below:

10. Copy the configured /data0 directory to the other two servers with scp:

scp -r /data0 root@192.168.2.89:/

scp -r /data0 root@192.168.2.87:/

Copy Tomcat as well:

scp -r /usr/local/tomcat root@192.168.2.89:/usr/local/

scp -r /usr/local/tomcat root@192.168.2.87:/usr/local/

Start Tomcat on all three servers with bin/startup.sh

 

11. Create the collection and its shards

# Create a collection with three shards, one replica per shard #

curl 'http://192.168.2.89:8080/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=1'

# Create additional shard replicas; for example, on 89 create the shard1 replica mycollection_shard1_replica_2 #

curl 'http://192.168.2.89:8080/solr/admin/cores?action=CREATE&collection=mycollection&name=mycollection_shard1_replica_2&shard=shard1'

curl 'http://192.168.2.87:8080/solr/admin/cores?action=CREATE&collection=mycollection&name=mycollection_shard1_replica_3&shard=shard1'

curl 'http://192.168.2.89:8080/solr/admin/cores?action=CREATE&collection=mycollection&name=mycollection_shard2_replica_2&shard=shard2'

curl 'http://192.168.2.87:8080/solr/admin/cores?action=CREATE&collection=mycollection&name=mycollection_shard2_replica_3&shard=shard2'

# Split shard1 again, issuing the request on 94 #

curl 'http://192.168.2.94:8080/solr/admin/collections?action=SPLITSHARD&collection=mycollection&shard=shard1'
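The Collections API calls above all share the same URL shape, so a small helper keeps the quoting in one place. This is a sketch only: collections_url is a hypothetical function name, it does no URL encoding, and the base URL is the Tomcat address used in this article.

```shell
# collections_url BASE ACTION [key=value ...]
# Prints BASE/admin/collections?action=ACTION&key=value&...
# Values are appended as given; nothing is URL-encoded.
collections_url() (
    base="$1"; action="$2"; shift 2
    url="$base/admin/collections?action=$action"
    for kv in "$@"; do
        url="$url&$kv"
    done
    echo "$url"
)

# Example: the CREATE and SPLITSHARD requests from step 11.
# curl "$(collections_url http://192.168.2.89:8080/solr CREATE \
#     name=mycollection numShards=3 replicationFactor=1)"
# curl "$(collections_url http://192.168.2.94:8080/solr SPLITSHARD \
#     collection=mycollection shard=shard1)"
```

Wrapping the result in double quotes when calling curl matters, because an unquoted & would send the command to the background.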

 

 

 

 

 

 

 

1. To store the index on HDFS, edit /usr/local/tomcat/bin/catalina.sh and add the HDFS-related options (-XX:MaxDirectMemorySize, -Dsolr.directoryFactory, -Dsolr.lock.type and -Dsolr.hdfs.home):

JAVA_OPTS="-server -Xmx2048m -Xms1024m -verbose:gc -Xloggc:solr_gc.log -XX:MaxDirectMemorySize=1g -Dsolr.directoryFactory=HdfsDirectoryFactory -Dsolr.lock.type=hdfs -Dsolr.hdfs.home=hdfs://host1xyz.com:9000/solr -Dsolr.solr.home=/data0/solrcloud/multicore -DzkHost=host3.com:2181,host4.com:2181,host5.com:2181"

2. Edit the /data0/solrcloud/multicore/collection/conf/solrconfig.xml file

Add this section:

    
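<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
   <str name="solr.hdfs.home">hdfs://host1xyz.com:9000/solr</str>
   <bool name="solr.hdfs.blockcache.enabled">true</bool>
   <int name="solr.hdfs.blockcache.slab.count">1</int>
   <bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
   <int name="solr.hdfs.blockcache.blocksperbank">16384</int>
   <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
   <bool name="solr.hdfs.blockcache.write.enabled">true</bool>
   <bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
   <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
   <int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">192</int>
   <str name="solr.hdfs.confdir">/home/hadoop/hadoop-2.2.0/etc/hadoop</str>
</directoryFactory>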

  

 

 

Then find ${solr.lock.type:native} and change it to ${solr.lock.type:hdfs}.

Note: leave ${solr.data.dir:} exactly as it is. If you add a path there, it overrides the HDFS path.
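The lock type change can be applied with sed rather than by hand. The following is a minimal sketch, assuming the placeholder appears in solrconfig.xml exactly as ${solr.lock.type:native}; set_hdfs_lock_type is a hypothetical helper name, and it rewrites the file through a temporary copy so that it does not depend on a particular sed -i dialect.

```shell
# set_hdfs_lock_type FILE
# Replaces ${solr.lock.type:native} with ${solr.lock.type:hdfs} in FILE.
# Other placeholders, such as ${solr.data.dir:}, are left untouched.
set_hdfs_lock_type() (
    file="$1"
    tmp="$file.tmp.$$"
    sed 's/\${solr\.lock\.type:native}/${solr.lock.type:hdfs}/' "$file" > "$tmp" \
        && mv "$tmp" "$file"
)

# Example:
# set_hdfs_lock_type /data0/solrcloud/multicore/collection/conf/solrconfig.xml
```

After the edit, upload the configuration to ZooKeeper again with the upconfig command from the SolrCloud section, since the nodes read solrconfig.xml from there.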

 

References:

https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS

http://shiyanjun.cn/archives/100.html

http://blog.csdn.net/shirdrn/article/details/9770829

http://blog.csdn.net/john_hongming/article/details/40113641

http://blog.csdn.net/john_hongming/article/details/40080947