
spark mllib prefixspan demo

Posted: 2019-04-23 12:46:53


run:

./bin/spark-submit ~/src_test/prefix_span_test.py

 

source code:

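from pyspark import SparkContext
from pyspark.mllib.fpm import PrefixSpan

# Run Spark locally; "testing" is the application name.
sc = SparkContext("local", "testing")
print(sc)

# Four input sequences; each sequence is an ordered list of itemsets.
data = [
    [["a"], ["a", "b", "c"], ["a", "c"], ["d"], ["c", "f"]],
    [["a", "d"], ["c"], ["b", "c"], ["a", "e"]],
    [["e", "f"], ["a", "b"], ["d", "f"], ["c"], ["b"]],
    [["e"], ["g"], ["a", "f"], ["c"], ["b"], ["c"]],
]
rdd = sc.parallelize(data, 2)  # distribute the sequences over 2 partitions

# minSupport=0.5: a pattern must occur in at least half of the sequences.
# maxPatternLength=4: upper bound on the length of reported patterns.
model = PrefixSpan.train(rdd, minSupport=0.5, maxPatternLength=4)
result = sorted(model.freqSequences().collect())
print("*" * 88)
print(result)
print("*" * 88)

sc.stop()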

output:

 

****************************************************************************************
[FreqSequence(sequence=[['a']], freq=4), FreqSequence(sequence=[['a'], ['a']], freq=2), FreqSequence(sequence=[['a'], ['b']], freq=4), FreqSequence(sequence=[['a'], ['b'], ['a']], freq=2), FreqSequence(sequence=[['a'], ['b'], ['c']], freq=2), FreqSequence(sequence=[['a'], ['b', 'c']], freq=2), FreqSequence(sequence=[['a'], ['b', 'c'], ['a']], freq=2), FreqSequence(sequence=[['a'], ['c']], freq=4), FreqSequence(sequence=[['a'], ['c'], ['a']], freq=2), FreqSequence(sequence=[['a'], ['c'], ['b']], freq=3), FreqSequence(sequence=[['a'], ['c'], ['c']], freq=3), FreqSequence(sequence=[['a'], ['d']], freq=2), FreqSequence(sequence=[['a'], ['d'], ['c']], freq=2), FreqSequence(sequence=[['a'], ['f']], freq=2), FreqSequence(sequence=[['b']], freq=4), FreqSequence(sequence=[['b'], ['a']], freq=2), FreqSequence(sequence=[['b'], ['c']], freq=3), FreqSequence(sequence=[['b'], ['d']], freq=2), FreqSequence(sequence=[['b'], ['d'], ['c']], freq=2), FreqSequence(sequence=[['b'], ['f']], freq=2), FreqSequence(sequence=[['b', 'a']], freq=2), FreqSequence(sequence=[['b', 'a'], ['c']], freq=2), FreqSequence(sequence=[['b', 'a'], ['d']], freq=2), FreqSequence(sequence=[['b', 'a'], ['d'], ['c']], freq=2), FreqSequence(sequence=[['b', 'a'], ['f']], freq=2), FreqSequence(sequence=[['b', 'c']], freq=2), FreqSequence(sequence=[['b', 'c'], ['a']], freq=2), FreqSequence(sequence=[['c']], freq=4), FreqSequence(sequence=[['c'], ['a']], freq=2), FreqSequence(sequence=[['c'], ['b']], freq=3), FreqSequence(sequence=[['c'], ['c']], freq=3), FreqSequence(sequence=[['d']], freq=3), FreqSequence(sequence=[['d'], ['b']], freq=2), FreqSequence(sequence=[['d'], ['c']], freq=3), FreqSequence(sequence=[['d'], ['c'], ['b']], freq=2), FreqSequence(sequence=[['e']], freq=3), FreqSequence(sequence=[['e'], ['a']], freq=2), FreqSequence(sequence=[['e'], ['a'], ['b']], freq=2), FreqSequence(sequence=[['e'], ['a'], ['c']], freq=2), FreqSequence(sequence=[['e'], ['a'], ['c'], ['b']], freq=2), FreqSequence(sequence=[['e'], ['b']], freq=2), FreqSequence(sequence=[['e'], ['b'], ['c']], freq=2), FreqSequence(sequence=[['e'], ['c']], freq=2), FreqSequence(sequence=[['e'], ['c'], ['b']], freq=2), FreqSequence(sequence=[['e'], ['f']], freq=2), FreqSequence(sequence=[['e'], ['f'], ['b']], freq=2), FreqSequence(sequence=[['e'], ['f'], ['c']], freq=2), FreqSequence(sequence=[['e'], ['f'], ['c'], ['b']], freq=2), FreqSequence(sequence=[['f']], freq=3), FreqSequence(sequence=[['f'], ['b']], freq=2), FreqSequence(sequence=[['f'], ['b'], ['c']], freq=2), FreqSequence(sequence=[['f'], ['c']], freq=2), FreqSequence(sequence=[['f'], ['c'], ['b']], freq=2)]
****************************************************************************************
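
To sanity-check individual freq values in the output above without starting Spark, a brute-force support counter in plain Python may help. This is only a sketch and not how PrefixSpan works internally: the helper names contains and support are made up for illustration, and it assumes the 0.5 support fraction is turned into a minimum count by rounding up.

```python
import math

# Same four input sequences as in the demo; each is an ordered list of itemsets.
data = [
    [["a"], ["a", "b", "c"], ["a", "c"], ["d"], ["c", "f"]],
    [["a", "d"], ["c"], ["b", "c"], ["a", "e"]],
    [["e", "f"], ["a", "b"], ["d", "f"], ["c"], ["b"]],
    [["e"], ["g"], ["a", "f"], ["c"], ["b"], ["c"]],
]


def contains(sequence, pattern):
    """Return True if pattern is a subsequence of sequence.

    Each itemset of the pattern must be a subset of a distinct itemset of
    the sequence, and the matched itemsets must appear in the same order.
    """
    pos = 0
    for itemset in pattern:
        needed = set(itemset)
        # Greedily take the earliest itemset that covers the needed items.
        while pos < len(sequence) and not needed.issubset(sequence[pos]):
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next pattern itemset must match strictly later
    return True


def support(pattern, sequences):
    """Count how many sequences contain the pattern."""
    return sum(contains(seq, pattern) for seq in sequences)


# Assumption: a fraction of 0.5 over 4 sequences means "at least 2 sequences".
min_count = math.ceil(0.5 * len(data))

print(support([["a"], ["c"], ["b"]], data))
print(support([["a"], ["b", "c"]], data))
print(support([["g"]], data), "vs. minimum count", min_count)
```

Item order inside an itemset does not matter here, so [["b", "a"]] and [["a", "b"]] get the same count. A pattern such as [["g"]] occurs in only one sequence, which is consistent with it being absent from the output.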


Original article: https://www.cnblogs.com/bonelee/p/10755622.html
