Installing prediction.io (blog categories: big data, prediction.io, recommender systems)

程序员文章站, 2024-03-16 11:20:22
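I finally got prediction.io installed on a VM today, and ran into a lot of problems along the way.

References:
https://docs.prediction.io/install/
https://docs.prediction.io/install/install-linux/ (standalone installation on Ubuntu; recommended)
https://docs.prediction.io/start/download/


If the database is not HBase (for example, MySQL), Elasticsearch does not need to be installed.

In pio/conf/pio-env.sh, change the DATABASE settings to MySQL.
Finally, run pio status to check whether the installation succeeded.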

The pio command only works after running
export PATH=/home/mochong/PredictionIO-0.9.3/bin:$PATH
Alternatively, put it in a system bin directory.
Reference:
http://*.com/questions/30417280/predictionio-pio-command-not-found-after-install
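
One way to confirm that the export took effect is to ask where pio resolves on the search path. The sketch below uses Python's standard shutil.which; the helper name find_pio is made up for illustration, and the directory in the usage note is the install path from this post, so adjust it to your own.

```python
import os
import shutil


def find_pio(extra_dir=None):
    """Return the full path of the pio executable, or None if it cannot be found.

    extra_dir is an optional directory searched first, which mirrors
    `export PATH=<extra_dir>:$PATH`.
    """
    search_path = os.environ.get("PATH", "")
    if extra_dir:
        search_path = extra_dir + os.pathsep + search_path
    return shutil.which("pio", path=search_path)
```

Calling find_pio() with no argument checks the current PATH only; find_pio("/home/mochong/PredictionIO-0.9.3/bin") shows what the lookup would return once that directory has been prepended.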

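To reach services inside the VMware VM from the host machine, set the VM's network connection to bridged mode and restart the VM.
After every restart, look up the VM's IP address again.
Run sudo ifconfig to see it.

After a restart, the pio directory also has to be added to PATH again, since the export only applies to the shell session it was run in.

To see which ports are in use:
sudo netstat -antup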
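
When only one or two ports matter, a direct connection test from the host can complement netstat. This is a minimal sketch using the standard socket module; port_is_open is a hypothetical helper name, and 7070 and 8000 in the usage note are the Event Server and engine ports mentioned elsewhere in this post.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise,
        # instead of raising an exception for a refused connection.
        return sock.connect_ex((host, port)) == 0
```

For example, port_is_open("192.168.3.111", 7070) run from the host tells you whether the Event Server inside the VM is reachable over the bridged network.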

demo:
http://192.168.3.111:7070/events.json?accessKey=qt61uzFbC8PZiVPtpTxSEM0nAcIns2tLOFbDiblL2GkkuKvoKokJGMq8IZ1h11FY
At first there are no events, so it returns
{"message":"Not Found"}

That changes once rate_item.php has been run.
rate_item.php:
<?php
	require_once("vendor/autoload.php");
	use predictionio\EventClient;
	// The trailing numeric arguments are timeouts. If the connection times out, nothing is printed to the screen.
	$client = new EventClient("qt61uzFbC8PZiVPtpTxSEM0nAcIns2tLOFbDiblL2GkkuKvoKokJGMq8IZ1h11FY",
	"http://192.168.3.111:7070", 0, 1); // please update; <URL OF EVENTSERVER> = "http://localhost:7070" by default

	// A user rates an item
	$result = $client->createEvent(array(
	   'event' => 'rate',
	   'entityType' => 'user',
	   'entityId' => 'davidhuang', // please update
	   'targetEntityType' => 'item', 
	   'targetEntityId' => 'sugar_1', // please update
	   'properties' => array('rating'=> "5") // please update
	));
	
	var_dump($result);
?>
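
The PHP SDK is a wrapper around the Event Server's REST API, so the same event can also be sent as a JSON POST to /events.json. The following is a rough Python equivalent using only the standard library; build_event_request and send are illustrative names, and YOUR_ACCESS_KEY in the usage note is a placeholder.

```python
import json
import urllib.parse
import urllib.request


def build_event_request(base_url, access_key, event):
    """Build (but do not send) a POST request for the Event Server's /events.json."""
    query = urllib.parse.urlencode({"accessKey": access_key})
    return urllib.request.Request(
        f"{base_url}/events.json?{query}",
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=5):
    """Send the request; an explicit timeout makes an unreachable server fail loudly."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


# Same event as in rate_item.php.
rate_event = {
    "event": "rate",
    "entityType": "user",
    "entityId": "davidhuang",
    "targetEntityType": "item",
    "targetEntityId": "sugar_1",
    "properties": {"rating": "5"},
}
```

Usage would look like send(build_event_request("http://192.168.3.111:7070", "YOUR_ACCESS_KEY", rate_event)). Unlike the silent PHP timeout, urlopen raises an exception when the server cannot be reached.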


After running it, refresh the events.json?xxx page in the browser; it now shows the newly inserted data:
[{"eventId":"5065b46a5c68428fb8bb0509caa12d93","event":"rate","entityType":"user","entityId":"davidhuang","targetEntityType":"item","targetEntityId":"sugar_1","properties":{"rating":"5"},"eventTime":"2015-07-31T07:27:33.000Z","creationTime":"2015-07-31T07:27:33.000Z"},{"eventId":"e00eb663c6f7417191c7984f31eacb76","event":"rate","entityType":"user","entityId":"davidhuang","targetEntityType":"item","targetEntityId":"sugar_1","properties":{"rating":"5"},"eventTime":"2015-07-31T07:27:44.000Z","creationTime":"2015-07-31T07:27:44.000Z"}]
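
The response is a plain JSON array, so it is easy to inspect programmatically. The sketch below parses the two events shown above (creationTime omitted for brevity) and prints one row per event; summarize is a hypothetical helper.

```python
import json

# The two events from the response above, with creationTime left out.
response_text = """[
  {"eventId": "5065b46a5c68428fb8bb0509caa12d93", "event": "rate",
   "entityType": "user", "entityId": "davidhuang",
   "targetEntityType": "item", "targetEntityId": "sugar_1",
   "properties": {"rating": "5"}, "eventTime": "2015-07-31T07:27:33.000Z"},
  {"eventId": "e00eb663c6f7417191c7984f31eacb76", "event": "rate",
   "entityType": "user", "entityId": "davidhuang",
   "targetEntityType": "item", "targetEntityId": "sugar_1",
   "properties": {"rating": "5"}, "eventTime": "2015-07-31T07:27:44.000Z"}
]"""


def summarize(events):
    """Return one (eventTime, entityId, event, targetEntityId, rating) tuple per event."""
    return [
        (
            e["eventTime"],
            e["entityId"],
            e["event"],
            e.get("targetEntityId"),
            e.get("properties", {}).get("rating"),
        )
        for e in events
    ]


events = json.loads(response_text)
rows = summarize(events)
for row in rows:
    print(row)
```

Two things stand out in this data: running the script twice produced two separate events with different eventId values, and the rating comes back as the string "5" because that is how rate_item.php sent it.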



------------------
PHP Client Setting
1 First, install Composer.
References:
http://taojinqu.blog.51cto.com/7849570/1351231
http://my.oschina.net/u/948242/blog/148269

After installation it prints:
Composer successfully installed to: /home/mochong/composer.phar
Use it: php composer.phar


----------------------DATABASE-----------
PIO creates the following tables:

| pio_event_1                  |
| pio_meta_accesskeys          |
| pio_meta_apps                |
| pio_meta_channels            |
| pio_meta_engineinstances     |
| pio_meta_enginemanifests     |
| pio_meta_evaluationinstances |
| pio_model_models             |

pio_event_1: stores the events, including the entities they refer to.
pio_meta_accesskeys: the access key table.
pio_meta_apps: the list of apps.


Reference for installing MySQL and phpMyAdmin on Ubuntu:
http://blog.csdn.net/tecn14/article/details/27515241

Using an engine template
1 Download the engine template
pio template get <template-repo-path> <your-new-engine-directory>
Reference: https://docs.prediction.io/start/download/
If you get the error ".templates-cache cannot be written to",
set the permissions of the .templates-cache file in the current directory to 777:
sudo chmod 777 .templates-cache


2 cd into the downloaded engine template directory and edit engine.json, changing the appName under datasource to your own app name.
"datasource": {
    "params" : {
      "name": "sample-handmade-data.txt",
      "appName": "firstpio",
      "eventNames": ["purchase", "view"]
    }
  },
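
If you switch between apps often, the edit can be scripted. This is a minimal sketch that assumes engine.json is strict JSON (no comments) and has the datasource.params layout shown above; set_app_name is an illustrative helper, not part of PIO.

```python
import json


def set_app_name(engine_json_path, app_name):
    """Set datasource.params.appName in an engine.json file and return the old value."""
    with open(engine_json_path, encoding="utf-8") as fh:
        config = json.load(fh)

    params = config["datasource"]["params"]
    old_name = params.get("appName")
    params["appName"] = app_name

    # Rewrite the whole file; other keys are preserved as loaded.
    with open(engine_json_path, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)
        fh.write("\n")
    return old_name
```

For example, set_app_name("engine.json", "firstpio") run inside the template directory. The name has to match an app that already exists in PIO.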


3 Install python-pip
sudo apt-get install python-pip

4 In the engine template directory, run:
a python data/import_eventserver.py --access_key NXDJ8UDK73uaynVwbih2MSGNoy07CVlRJXu16tsj5sFMOwHldpjtZ1sbMGougSvB
This generates the data.

b pio build --verbose
This failed with:
Could not retrieve sbt 0.13.7
pio build Return code of previous step is 1. Aborting.
Reference:
https://groups.google.com/forum/#!topic/predictionio-user/fllCh8n-0d4
From here:
https://d29vzk4ow07wi7.cloudfront.net/b407b2a76ad72165f806ac7e7ea09132b951ef53?response-content-disposition=attachment%3Bfilename%3D%22sbt-launch.jar%22&Policy=eyJTdGF0ZW1lbnQiOiBbeyJSZXNvdXJjZSI6Imh0dHAqOi8vZDI5dnprNG93MDd3aTcuY2xvdWRmcm9udC5uZXQvYjQwN2IyYTc2YWQ3MjE2NWY4MDZhYzdlN2VhMDkxMzJiOTUxZWY1Mz9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPWF0dGFjaG1lbnQlM0JmaWxlbmFtZSUzRCUyMnNidC1sYXVuY2guamFyJTIyIiwiQ29uZGl0aW9uIjp7IkRhdGVMZXNzVGhhbiI6eyJBV1M6RXBvY2hUaW1lIjoxNDM4MzMyNTA0fSwiSXBBZGRyZXNzIjp7IkFXUzpTb3VyY2VJcCI6IjAuMC4wLjAvMCJ9fX1dfQ__&Signature=qYv5RDh0C1sHIl7qYfRqyCtugYW1IEJwpO~p6BJjVRqNkGQplkwmpcH6T-1g69YOuI7wsw8xa5Gy3jstjTYjrackGIIdvnfqB3NYQ5yaehuwyexE1grUuhyF2HNXBsXNPuPvPK0X73tHdusEpeIj5FS-nK6u5kRcxaOBIcNO0tYD4db2p~Tcg4a78LzfvFm~r5doZ8JO2Qv9IgidcpdWCB0rDTAYfm2wO-my~bA8Qjh8CFJl43JGz-FCVhYu7wJri1wIulGU~2ZNkZDrVooiEdO4W~luiw5odmme~vXeiV1bFt-xIkPNMa6uK0PVn5ePOucnVeFSS62kyAPdYgvCQw__&Key-Pair-Id=APKAIFKFWOMXM2UMTSFA

download a new sbt-launch.jar and use it to replace sbt/sbt-launch-0.13.7.jar

Output after running it:
[INFO] [Console$] Using existing engine manifest JSON at /home/mochong/PredictionIO-0.9.3/templates/tapster-episode-similar/manifest.json
[INFO] [Console$] Using command '/home/mochong/PredictionIO-0.9.3/sbt/sbt' at the current working directory to build.
[INFO] [Console$] If the path above is incorrect, this process will fail.
[INFO] [Console$] Uber JAR disabled. Making sure lib/pio-assembly-0.9.3.jar is absent.
[INFO] [Console$] Going to run: /home/mochong/PredictionIO-0.9.3/sbt/sbt  package assemblyPackageDependency
[INFO] [Console$] OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000c5550000, 715849728, 0) failed; error='Cannot allocate memory' (errno=12)
[INFO] [Console$] #
[INFO] [Console$] # There is insufficient memory for the Java Runtime Environment to continue.
[INFO] [Console$] # Native memory allocation (malloc) failed to allocate 715849728 bytes for committing reserved memory.
[INFO] [Console$] # An error report file with more information is saved as:
[INFO] [Console$] # /home/mochong/PredictionIO-0.9.3/templates/tapster-episode-similar/hs_err_pid8872.log
[ERROR] [Console$] Return code of previous step is 1. Aborting.



Java did not have enough memory to run. Try setting the VM's memory to 2048 MB.
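
To get a feel for how much memory the build was asking for, the byte count in the error message can be converted to MiB. The snippet below is just that arithmetic applied to the log line above; failed_allocation_mib is a made-up helper name.

```python
import re

log_line = (
    "# Native memory allocation (malloc) failed to allocate 715849728 bytes "
    "for committing reserved memory."
)


def failed_allocation_mib(line):
    """Extract the failed allocation size from a JVM error line, in MiB (or None)."""
    match = re.search(r"failed to allocate (\d+) bytes", line)
    if match is None:
        return None
    return int(match.group(1)) / (1024 * 1024)


print(failed_allocation_mib(log_line))  # 682.6875
```

So the JVM wanted roughly 683 MiB in a single allocation, in addition to whatever the OS and other services were already using, which is why a small VM runs out.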

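Also set the permissions of pio.log and pio.sbt in the template directory to 777; otherwise a permission error is reported.


Logs are saved to pio.log.

In the end it still failed, reporting: Could not retrieve sbt 0.13.7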


c pio train

d pio deploy

Reference: https://docs.prediction.io/deploy/

The engine's default port is 8000.

----------------------------------------
export app data
pio export --appid 1 --output firstapp
This creates a firstapp directory under the root directory, and the data is exported there.

-----------------------------------------------------------------
Event model
All entityType names starting with "$" or "pio_" are reserved and shouldn't be used.

All event names starting with "$" or "pio_" are reserved
and shouldn't be used as your custom event names (e.g. "$set").

All property names starting with "$" or "pio_" are likewise reserved by PIO and cannot be used.

The following special events are reserved for updating entities and their properties:
$set, $unset and $delete.
"$set" event: Sets properties of an entity (and implicitly creates the entity). To change an entity's properties, simply set the corresponding properties again with the new values. $set events should be created only when:
the entity is first created (or re-created after a $delete event), or
the entity's existing or new properties are set to new values (for example, a user updates their email, a user adds a phone number, an item gets updated categories).
"$unset" event: Unsets properties of an entity, meaning the specified properties are treated as no longer existing. Note that the properties field cannot be empty for a $unset event.
"$delete" event: Deletes the entity.
There is no targetEntityId for these special events.

$set, $unset and $delete are reserved by the system for these purposes and cannot be used as custom event names.

Reference:
https://docs.prediction.io/datacollection/eventmodel/
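
The naming rules above are easy to check on the client before an event is sent. The following sketch encodes only the rules quoted in this section; check_event is a hypothetical helper, and the Event Server's own validation may be stricter or differ in detail.

```python
RESERVED_PREFIXES = ("$", "pio_")
SPECIAL_EVENTS = {"$set", "$unset", "$delete"}


def is_reserved(name):
    """True if the name starts with one of the reserved prefixes."""
    return name.startswith(RESERVED_PREFIXES)


def check_event(event):
    """Return a list of problems found in an event dict (an empty list means OK)."""
    problems = []
    name = event.get("event", "")

    # Custom event names must not use a reserved prefix; the special events may.
    if is_reserved(name) and name not in SPECIAL_EVENTS:
        problems.append(f"event name {name!r} uses a reserved prefix")
    if is_reserved(event.get("entityType", "")):
        problems.append("entityType uses a reserved prefix")
    for key in event.get("properties", {}):
        if is_reserved(key):
            problems.append(f"property {key!r} uses a reserved prefix")

    # Rules specific to $set, $unset and $delete.
    if name in SPECIAL_EVENTS and "targetEntityId" in event:
        problems.append(f"{name} events have no targetEntityId")
    if name == "$unset" and not event.get("properties"):
        problems.append("$unset requires a non-empty properties field")
    return problems
```

For instance, check_event({"event": "pio_custom", "entityType": "user", "entityId": "u1"}) reports the reserved prefix, while a normal rate event or a $set with properties passes with an empty list.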

---------------------------------------------------

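Batch import
The PHP SDK does not support this yet. However, you can dump the data into a JSON file and import it from the shell with pio import:
 pio import --appid 123 --input my_events.json

After a successful import, the imported data shows up in the event table of the corresponding app ID.
Reference: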
https://docs.prediction.io/datacollection/batchimport/
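
A small script can produce the input file for pio import. This sketch assumes the layout described on the batch import page linked above, namely one JSON event object per line; write_import_file is an illustrative name and the two sample events reuse values that appear elsewhere in this post.

```python
import json


def write_import_file(events, path):
    """Write events to path, one JSON object per line, and return the count."""
    count = 0
    with open(path, "w", encoding="utf-8") as fh:
        for event in events:
            fh.write(json.dumps(event) + "\n")
            count += 1
    return count


sample_events = [
    {"event": "rate", "entityType": "user", "entityId": "davidhuang",
     "targetEntityType": "item", "targetEntityId": "sugar_1",
     "properties": {"rating": 5}, "eventTime": "2015-07-31T07:27:33.000Z"},
    {"event": "buy", "entityType": "user", "entityId": "3",
     "targetEntityType": "item", "targetEntityId": "0",
     "eventTime": "2014-11-21T01:04:14.000Z"},
]
```

Calling write_import_file(sample_events, "my_events.json") produces a file that can then be passed to pio import --appid 123 --input my_events.json.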
-----------------------------------------------

Analysis tool: IPython

Download and install IPython
http://ipython.org/install.html
Run
pip install ipython


References:
http://ipython.org/install.html

http://mindonmind.github.io/2013/02/08/ipython-notebook-interactive-computing-new-era/

------------------------------------------
Event API
1 Get an event by its eventId
http://192.168.3.111:7070/events/0d1eb5c7d2eb4f2db85f66c514a46143.json?accessKey=qt61uzFbC8PZiVPtpTxSEM0nAcIns2tLOFbDiblL2GkkuKvoKokJGMq8IZ1h11FY

Returns
{"eventId":"0d1eb5c7d2eb4f2db85f66c514a46143","event":"buy","entityType":"user","entityId":"3","targetEntityType":"item","targetEntityId":"0","properties":{},"eventTime":"2014-11-21T01:04:14.000Z","creationTime":"2015-07-31T12:27:45.000Z"}


2 Delete an event
curl -i -X DELETE http://localhost:7070/events/<your_eventId>.json?accessKey=<your_accessKey>

3 Get all events of an app
http://192.168.3.111:7070/events.json?accessKey=qt61uzFbC8PZiVPtpTxSEM0nAcIns2tLOFbDiblL2GkkuKvoKokJGMq8IZ1h11FY

By default, it returns at most 20 events. Use the limit parameter to specify how many events are returned (see below). Use cautiously!

In addition, the following optional parameters are supported:

startTime: time in ISO8601 format. Return events with eventTime >= startTime.
untilTime: time in ISO8601 format. Return events with eventTime < untilTime.
entityType: String. The entityType. Return events for this entityType only.
entityId: String. The entityId. Return events for this entityId only.
limit: Integer. The number of events returned. Default is 20. -1 to get all.
reversed: Boolean. Must be used with both entityType and entityId specified; returns events in reverse chronological order. Default is false.

Reference:
https://docs.prediction.io/datacollection/eventapi/#using-event-api
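
Building these query strings by hand gets error-prone once timestamps are involved, since characters such as ":" need URL encoding. The helper below is a sketch using urllib.parse.urlencode; events_url is a made-up name, and passing booleans as lowercase true/false is an assumption on my part.

```python
from urllib.parse import urlencode


def events_url(base_url, access_key, **filters):
    """Build an events.json query URL; filters whose value is None are left out."""
    params = {"accessKey": access_key}
    for name, value in filters.items():
        if value is None:
            continue
        # Assumption: booleans are passed as lowercase true/false.
        if isinstance(value, bool):
            value = "true" if value else "false"
        params[name] = value
    return f"{base_url}/events.json?{urlencode(params)}"


print(events_url(
    "http://localhost:7070", "YOUR_ACCESS_KEY",
    entityType="user", entityId="davidhuang",
    startTime="2015-07-31T00:00:00.000Z", limit=5, reversed=True,
))
```

The filter names are the optional parameters listed above (startTime, untilTime, entityType, entityId, limit, reversed); YOUR_ACCESS_KEY is a placeholder.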

--------------------------------------------------------------------------
## PredictionIO Step by Step Setup Guide v1.0

**A. SERVER SIDE**

1. Ensure you have an appropriate Java version (JDK 7) installed *(if you use the quick install command from https://docs.prediction.io/install/, skip steps 1 to 4 and go to step 5)*

java -version

2. Download JDK 7 from http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html and install it (optional)
3. Download PredictionIO

wget https://d8k1yxp8elc6b.cloudfront.net/PredictionIO-0.9.3.tar.gz
tar zxvf PredictionIO-0.9.3.tar.gz
4. Install dependencies

mkdir PredictionIO-0.9.3/vendors
wget http://d3kbcqa49mib13.cloudfront.net/spark-1.3.1-bin-hadoop2.6.tgz
tar zxvfC spark-1.3.1-bin-hadoop2.6.tgz PredictionIO-0.9.3/vendors
5. Create database "pio" *(assuming you have a database server up and running)*

6. Update pio-env.sh *(especially database settings)*

vi PredictionIO-0.9.3/conf/pio-env.sh
7. Start PredictionIO and dependent services

PredictionIO-0.9.3/bin/pio eventserver &

It prints "EventServer is ready" if everything is OK.
8. Check PredictionIO status

PredictionIO-0.9.3/bin/pio status
It prints "Your system is all ready to go." if everything is OK.
9. Download a Template (from http://templates.prediction.io/), e.g. the Recommendation template

PredictionIO-0.9.3/bin/pio template get PredictionIO/template-scala-parallel-recommendation <YourNewEngineDir>   # e.g. MyRecommendation
cd MyRecommendation
10. Generate an App ID and Access Key

../PredictionIO-0.9.3/bin/pio app new MyApp1
11. List App

../PredictionIO-0.9.3/bin/pio app list

**B. CLIENT SIDE**

*Assume:*
*[weburl] = http://localhost:8888/predictionio-myapp1*
*[webdir] = /Application/MAMP/htdocs/predictionio-myapp1*

1. Download PredictionIO PHP SDK *(By Composer)*

cd [webdir]/
1.1 Create [webdir]/composer.json

{
    "require": {
        "predictionio/predictionio": "~0.8.2"
    }
}
1.2 Install Composer

curl -sS https://getcomposer.org/installer | php -d detect_unicode=Off
1.3 Use Composer to install your dependencies

php composer.phar install
2. Create and update [webdir]/rate-item.php

<?php
require_once("vendor/autoload.php");
use predictionio\EventClient;

$client = new EventClient(<ACCESS KEY>, <URL OF EVENTSERVER>); // please update; <URL OF EVENTSERVER> = "http://localhost:7070" by default

// A user rates an item
$client->createEvent(array(
   'event' => 'rate',
   'entityType' => 'user',
   'entityId' => <USER ID>, // please update
   'targetEntityType' => 'item',
   'targetEntityId' => <ITEM ID>, // please update
   'properties' => array('rating'=> <RATING>) // please update
));
?>
3. Browse http://localhost:8888/predictionio-myapp1/rate-item.php to insert data into database

**C. SERVER SIDE**

12. Check that querying the event server works
Browse http://localhost:7070/events.json?accessKey=[YOUR_ACCESS_KEY]

13. Deploy the engine as a service

cd MyRecommendation

14. Update "appName" in engine.json

...
  "datasource": {
    "params" : {
      "appName": "MyApp1" // must match your app name
    }
  },
  ...

15. Build the engine

../PredictionIO-0.9.3/bin/pio build --verbose

16. Train the predictive model

../PredictionIO-0.9.3/bin/pio train

*If you get java.net.UnknownHostException, please check /etc/hosts. (Read more: http://*.com/questions/19330334/hadoop-on-mac-pseudo-node-nodename-nor-servname-provided-or-not-known)*

17. Deploy the engine

../PredictionIO-0.9.3/bin/pio deploy

18. Check the engine status by browsing http://localhost:8000


**D. CLIENT SIDE**

4. Create [webdir]/send-query.php

<?php
require_once("vendor/autoload.php");
use predictionio\EngineClient;

$client = new EngineClient('http://localhost:8000');

$response = $client->sendQuery(array('user'=> 1, 'num'=> 4));
print_r($response);

?>

5. Browse http://localhost:8888/predictionio-myapp1/send-query.php
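
For testing the deployed engine without a PHP environment, the same query can be sent as a JSON POST to the engine's /queries.json endpoint. This is a rough Python equivalent of send-query.php using only the standard library; build_query_request and send_query are illustrative names.

```python
import json
import urllib.request


def build_query_request(engine_url, query):
    """Build (but do not send) a POST request for the engine's /queries.json."""
    return urllib.request.Request(
        f"{engine_url}/queries.json",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_query(engine_url, query, timeout=5):
    """Send the query to a running engine and return the decoded JSON response."""
    request = build_query_request(engine_url, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the engine deployed on the default port, send_query("http://localhost:8000", {"user": 1, "num": 4}) sends the same query as the PHP script. The query fields depend on the template, so treat user and num as specific to the Recommendation template used here.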

**E. REFERENCES**

1. PredictionIO - A Quick Intro
https://docs.prediction.io/start/

2. Installing PredictionIO on Linux / Mac OS X
https://docs.prediction.io/install/install-linux/

3. Java SE Development Kit 7 Downloads
http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html

4. PredictionIO Engine Templates
http://templates.prediction.io/

5. Quick Start - Recommendation Engine Template
https://docs.prediction.io/templates/recommendation/quickstart/

6. PredictionIO-PHP-SDK
https://github.com/PredictionIO/PredictionIO-PHP-SDK