PMXBOT Log file Viewer


#mongodb logs for Wednesday the 20th of July, 2016

[00:51:56] <starseed> I'm looking for a way to determine the age or creation date of a cursor. Anyone know a way to do this?
[02:01:00] <starseed> I'm looking for a way to determine the age or creation date of a cursor. Anyone know a way to do this?
[02:02:01] <cheeser> i don't think there is. what are you trying to find?
[02:14:54] <starseed> just what I said, the age of the cursor
[02:15:06] <starseed> We have bad code in an application that unfortunately depends on notimeout
[02:15:32] <starseed> so we have these zombie cursors popping up occasionally which block sharding chunk migrations
[02:16:39] <starseed> I am scraping the Cursor IDs out of our logs, then I need to figure out how long each cursor has existed and kill them if they are older than x hours
[02:21:42] <cheeser> i'm surprised they last longer than 10 minutes, tbh
[02:21:53] <cheeser> i thought that was the timeout on cursors.
[02:23:18] <cheeser> i'd start by grepping the source for that option and removing it.
[02:54:39] <starseed> that's the default timeout
[02:55:08] <starseed> you can specify notimeout, which means they live until killed via command or you restart the mongod process
[02:56:06] <starseed> we have some very very long running queries. Due to shard distribution, sometimes we will open a query and access some data...then it'll be more than ten minutes until that query needs data located on the shard again. So the cursor would timeout if set to default and the query fails
[02:57:24] <starseed> our devs 'workaround' was to specify notimeout on the cursors...which frankly wouldn't be the end of the world, except that open cursors block chunk migrations.
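The log-scraping approach starseed describes can be sketched like this. This is a minimal sketch, not production code: the log lines and cursor ids below are made up, and real mongod log formats vary by version. It can only bound a cursor's age from below (a cursor may be older than its first logged mention), which is the best available given that the server exposes no creation timestamp.

```javascript
// Estimate cursor age from the earliest log line mentioning each cursor id.
// Hypothetical, simplified log lines:
const logLines = [
  '2016-07-20T02:10:11.000+00:00 I QUERY [conn12] getmore db.events cursorid:8810245342',
  '2016-07-20T09:45:00.000+00:00 I QUERY [conn19] getmore db.events cursorid:5551212000',
];

// cursor id -> earliest timestamp (ms since epoch) it appears in the logs
function firstSeen(lines) {
  const seen = new Map();
  for (const line of lines) {
    const m = line.match(/^(\S+) .*cursorid:(\d+)/);
    if (!m) continue;
    const ts = Date.parse(m[1]);
    if (!seen.has(m[2]) || ts < seen.get(m[2])) seen.set(m[2], ts);
  }
  return seen;
}

// ids first seen more than maxAgeHours before `now` (ms since epoch)
function staleCursors(seen, now, maxAgeHours) {
  const cutoff = now - maxAgeHours * 3600 * 1000;
  return [...seen.entries()].filter(([, ts]) => ts < cutoff).map(([id]) => id);
}
```

The resulting ids could then be passed to the server's killCursors command to unblock chunk migrations.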
[07:14:39] <skoude> hmm.. I would need a unique key for every single mongo document.. Should I use ObjectId or should I generate my own one? Any best practices for this?
[07:16:47] <joannac> if you have a built in unique key for your documents, then use that
[07:16:52] <joannac> otherwise use the objectID
[10:00:49] <jokke> hey
[10:01:56] <jokke> my disk space ran out.. i deleted old time series data and wanted to reclaim the space with db.repairDatabase() but i got the following error: Cannot repair database energybox having size: 478605737984 (bytes) because free disk space is: 2851266560 (bytes)
[10:02:02] <jokke> is there really nothing i can do?
[10:02:24] <Derick> you can setup a secondary, and let it sync
[10:02:34] <Derick> then tear down the first node and reconfigure the replicaset
[10:02:53] <jokke> mhm ok
[10:16:15] <jokke> Derick: does it sync automatically after reconfiguring the rs?
[10:16:34] <jokke> ah seems so
[10:39:43] <kedare> Hello o/
[10:39:57] <kedare> Simple question, is this still true ?
[10:39:59] <kedare> "One MongoDB 3.2 feature we demoed at MongoDB World in June was $lookup. $lookup is an aggregation pipeline stage that lets you insert data from a different collection into your pipeline. This is effectively a left outer join. $lookup will only be included in MongoDB Enterprise Server"
[10:40:09] <kedare> Enterprise Server only ?
[10:40:42] <Derick> not true
[10:41:46] <Derick> that article links to: https://www.mongodb.com/blog/post/revisiting-usdlookup
[10:41:52] <kedare> Ok, I got a little WTF? when I saw this
[10:43:30] <kedare> Does this perform as well as SQL joins?
[10:43:57] <kedare> I mean, is it a good idea to rely on them in a lot of queries?
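As background to the quoted blog text: $lookup behaves as a left outer join, keeping every input document and attaching an array of matching foreign documents (empty when there are none). An in-memory model of those semantics, with made-up collection and field names:

```javascript
// Hypothetical "local" and "foreign" collections:
const orders = [{ _id: 1, item: 'abc' }, { _id: 2, item: 'xyz' }];
const items  = [{ _id: 10, sku: 'abc', desc: 'widget' }];

// Model of $lookup: every local doc survives, with an array of matches
// under `asField` (empty when nothing matches -- left outer join).
function lookup(localDocs, foreignDocs, localField, foreignField, asField) {
  return localDocs.map(doc => ({
    ...doc,
    [asField]: foreignDocs.filter(f => f[foreignField] === doc[localField]),
  }));
}
```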
[11:41:52] <gzoo> On Linux, will a `service mongod stop` shutdown gracefully like `mongod --shutdown`?
[11:42:49] <Derick> it should, yes
[11:44:11] <gzoo> Derick, it makes me uneasy that it's not mentioned on the official docs page for shutting down the server. Maybe I'm just being paranoid
[12:30:36] <atbe> Hey guys, for the logging done on the COMMAND component, how should we interpret the message?
[12:30:53] <atbe> for example the log message is as follows
[12:30:56] <atbe> I COMMAND [conn409274] command config.$cmd command: update { update: "mongos", updates: [ { q: { _id: "mdb32-rh7-s1-r:27017" }, u: { $set: { _id: "mdb32-rh7-s1-r:27017", ping: new Date(1469016660500), up: 148264, waiting: true, mongoVersion: "3.2.8" } }, multi: false, upsert: true } ], writeConcern: { w: "majority", wtimeout: 15000 }, maxTimeMS: 30000 }
[12:30:57] <atbe> keyUpdates:0 writeConflicts:0 numYields:0 reslen:386 locks:{ Global: { acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w: 2 } }, Collection: { acquireCount: { w: 1 } }, Metadata: { acquireCount: { w: 1 } }, oplog: { acquireCount: { w: 1 } } } protocol:op_command 106ms
[12:31:59] <atbe> https://usercontent.irccloud-cdn.com/file/vXajN1G8/
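One way to read such a line mechanically: the tail carries counters (`reslen`, yields, write conflicts, the lock acquireCounts) and the total duration in milliseconds. A parsing sketch over an abbreviated copy of the line above (abbreviated by hand here; the exact layout varies across server versions):

```javascript
// Abbreviated copy of the COMMAND log line quoted above:
const line = 'I COMMAND [conn409274] command config.$cmd command: update ' +
  'keyUpdates:0 writeConflicts:0 numYields:0 reslen:386 protocol:op_command 106ms';

// Pull out every `name:<integer>` counter and the trailing duration.
function parseCommandLine(s) {
  const counters = {};
  for (const m of s.matchAll(/(\w+):(\d+)\b/g)) counters[m[1]] = Number(m[2]);
  const ms = s.match(/(\d+)ms$/);
  return { counters, durationMs: ms ? Number(ms[1]) : null };
}
```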
[12:41:55] <atbe> Are there any static blocks running on the Java Mongo Driver?
[12:52:15] <cheeser> atbe: yes
[12:53:41] <atbe> I'm seeing some network discovery going on before I started to use my MongoDriver, is this by design? The mongo code is packaged in with some other code. But the mongo code had not been executed and it already started checking the network for mongo instances
[12:58:50] <cheeser> that's likely the driver discovering the cluster topology. i.e., trying to find the primary
[12:59:05] <cheeser> but that has nothing to do with anything static...
[13:03:13] <atbe> I see, could you point me to the code that does this?
[13:03:18] <atbe> on github?
[13:07:02] <cheeser> i don't know where it is offhand
[13:07:47] <atbe> I can clone the driver and grep the source code, any keywords to help narrow the search?
[13:08:21] <cheeser> isMaster
[13:12:51] <atbe> hmm, anything else?
[13:14:26] <cheeser> not really
[13:14:35] <CustosLimen> hi
[13:14:43] <CustosLimen> how do I get typeid of a field with mongo shell ?
[13:15:13] <atbe> cheeser: can I disable this discovery until the client is explicitly initialized?
[13:15:16] <cheeser> CustosLimen: https://docs.mongodb.com/manual/reference/operator/query/type/
[13:15:29] <cheeser> atbe: it already *is* explicitly initialized.
[13:17:39] <CustosLimen> That allows me to compare the field type
[13:17:41] <CustosLimen> not get it
[13:17:46] <CustosLimen> cheeser, I want to get the value
[13:17:53] <CustosLimen> comparing it to all values seems counter productive
[13:20:04] <cheeser> CustosLimen: https://docs.mongodb.com/manual/core/shell-types/#check-types-in-shell
[13:20:19] <atbe> cheeser: So even before I use `new MongoClient()`, a `MongoClient` instance is already created as long as I include the library?
[13:20:31] <cheeser> of course not.
[13:21:25] <CustosLimen> cheeser, does not give typeid
[13:21:51] <cheeser> typeof doesn't?
[13:22:04] <atbe> So, is it safe to say that the Mongo Driver has some network discovery code included that runs as long as you include the library?
[13:22:22] <cheeser> atbe: no. that's utterly wrong.
[13:23:03] <cheeser> there's no way for the driver to know what machines to talk to until you give at least one host in a constructor call on MongoClient
[13:23:29] <atbe> so there are no static blocks executing network code when loaded into the jvm/
[13:23:49] <atbe> I am just trying to be as thorough as can be
[13:23:52] <cheeser> of course not. how would they know who to talk to?
[13:25:11] <atbe> cheeser: good deal, thanks
[13:38:35] <kenalex> hello
[13:46:02] <CustosLimen> cheeser, https://bpaste.net/show/12f45396b822
[13:46:11] <CustosLimen> it gives number/object/string etc
[13:46:17] <CustosLimen> and does not say what object
[13:46:29] <CustosLimen> and I'm not sure what to do with number
[13:46:34] <CustosLimen> is number NumberLong ?
[13:46:36] <CustosLimen> or NumberInt ?
[13:48:43] <cheeser> i think that's as close as you're going to get. why does the type number matter?
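What CustosLimen is running into: the shell wraps BSON int64/int32 values in NumberLong/NumberInt wrapper objects, so `typeof` only reports "object", and an `instanceof` check is needed to tell them apart. A node-runnable sketch, where NumberLong/NumberInt are simplified stand-ins for the shell's wrapper types:

```javascript
// Stand-ins for the mongo shell's BSON wrapper types:
function NumberLong(v) { this.v = v; }
function NumberInt(v)  { this.v = v; }

// typeof is coarse; instanceof narrows the wrapper types.
function bsonTypeName(x) {
  if (x instanceof NumberLong) return 'long';
  if (x instanceof NumberInt)  return 'int';
  if (typeof x === 'number')   return 'double'; // bare shell numbers are doubles
  if (typeof x === 'string')   return 'string';
  return typeof x; // "object" etc. for everything else
}
```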
[13:53:22] <jokke> i synced the db to a new secondary, switched off the primary and want to elect the secondary to primary again
[13:54:01] <cheeser> you can't force a specific machine to become the primary but you can ask the current primary to step down
[13:54:04] <Derick> you can do a step down on...
[13:54:06] <Derick> what cheeser said
[13:54:07] <cheeser> that will force an election
[13:54:11] <jokke> oh
[13:54:17] <jokke> i see
[13:54:35] <cheeser> you can muck about with priorities but that's generally discouraged.
[13:54:51] <jokke> how do i make the primary step down
[13:55:00] <cheeser> open the shell: rs.step
[13:55:03] <cheeser> open the shell: rs.stepDown()
[13:55:21] <cheeser> on that primary, of course
[13:55:26] <jokke> wat... "not primary so can't step down"
[13:55:32] <jokke> (that was on the primary)
[13:55:37] <cheeser> rs.status()
[13:55:45] <cheeser> that'll show you the primary
[13:55:58] <jokke> mhm
[13:56:03] <jokke> "Our replica set config is invalid or we are not a member of it",
[13:56:16] <jokke> yeah ok
[13:56:22] <jokke> i have to add it again as a member
[13:56:37] <cheeser> and wait for sync
[14:01:11] <jokke> now my log is full of these messages: replSet couldn't elect self, only received 1 votes, but needed at least 2
[14:02:21] <cheeser> how many members are shown in rs.status()
[14:02:23] <cheeser> ?
[14:02:26] <jokke> two
[14:02:43] <cheeser> yeah. you'll need one more member. could be an arbiter.
[14:02:52] <jokke> ah yeah ok
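The "needed at least 2" message follows from election math: a candidate must win a strict majority of the *configured* voting members (counting its own vote), so a 2-member set with one node down can never elect a primary. In short:

```javascript
// A candidate needs a strict majority of configured voting members;
// votes from down members are simply missing.
function votesNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

function canElectPrimary(membersUp, votingMembers) {
  return membersUp >= votesNeeded(votingMembers);
}
```

Hence cheeser's suggestion of a third member: an arbiter votes in elections without holding data, taking the set to 3 voters so one survivor plus the arbiter still form a majority.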
[14:06:57] <nikitosiusis> https://docs.mongodb.com/manual/faq/concurrency/#how-does-concurrency-affect-secondaries this says mongo applies replication oplog in batches and doesn't allow reads while that. can I tune this somehow? When I apply many writes to primary, I get significant read time increase on secondary
[14:07:55] <cheeser> why are you trying to read from the secondary?
[14:11:32] <nikitosiusis> because master dies when I try to read and write simultaneously
[14:11:59] <cheeser> master? or primary?
[14:12:11] <cheeser> and the primary shouldn't die like that
[14:17:17] <nikitosiusis> reading from the secondary is a best practice as written in the docs, that's why we are doing this
[14:17:45] <nikitosiusis> so what about tuning those bulk writes? can I change size or something?
[14:17:56] <cheeser> mongodb's docs say that?
[14:19:27] <cheeser> https://docs.mongodb.com/manual/core/read-preference/#counter-indications
[14:22:38] <nikitosiusis> ok here is that https://docs.mongodb.com/manual/reference/parameters/#replication-parameters
[14:38:19] <nikitosiusis> but it doesn't seem to work: "errmsg" : "can't set replApplyBatchSize on a non-slave machine"
[14:52:42] <atbe> cheeser: hey, silly question, log verbosity 0 will show less information than 5?
[14:52:48] <cheeser> yes
[14:54:16] <atbe> cool, for a standalone, will things like queries and commands be included in the log?
[14:54:32] <cheeser> only slow ones
[14:58:00] <atbe> can I have it log all activity? Like completely?
[14:58:16] <cheeser> crank it up
[14:58:22] <atbe> to 5?
[14:58:23] <atbe> I did
[14:58:29] <Derick> 11!
[14:58:33] <Derick> oh no, that's the sound system
[14:58:34] <atbe> srsly??
[14:58:36] <atbe> lol
[14:58:40] <atbe> TT__TT
[15:00:25] <atbe> http://webcache.googleusercontent.com/search?q=cache:cxhFg7Wq57AJ:stackoverflow.com/questions/15204341/mongodb-logging-all-queries+&cd=5&hl=en&ct=clnk&gl=us
[15:00:27] <atbe> pew pew
[15:14:14] <AAA_awright> Hey, I popped in here a while back reporting troubles with MongoDB eating 110% CPU sitting idle doing nothing, no I/O, no network, empty data
[15:14:37] <AAA_awright> It appears it was caused by a leapsecond bug fixed with a `sudo date` operation
[15:15:02] <AAA_awright> https://www.mongodb.com/blog/post/mongodb-and-leap-seconds claims MongoDB is leapsecond-safe, is this really the case?
[15:15:04] <cheeser> ha! ouch.
[15:15:43] <AAA_awright> This was bothering me for weeks and I just gave up until recently. A restart may have also helped, but I can't easily do that since it's a production machine
[15:16:01] <cheeser> replica set?
[15:16:20] <AAA_awright> This is stand-alone, single node
[15:16:27] <cheeser> ah. well. step one. :)
[15:16:28] <AAA_awright> I also run MySQL and that didn't have any problems
[15:16:36] <AAA_awright> so idk what part of this is MongoDB-specific
[15:17:24] <cheeser> yeah, i dunno either.
[15:18:15] <jaelae> hmf mongodb cloud manager was down earlier right?
[15:18:40] <cheeser> it's still a bit bouncy
[16:41:23] <sivli> Hey all
[17:06:24] <nikitosiusis> should I use lvm when using mongo?
[17:06:47] <humanBird> mongoloidDB?
[18:46:50] <boutell> Hi. I’m getting an “Overflow sort stage buffered data usage of 33559541 bytes exceeds internal limit of 33554432 bytes” error. I know this means I’m trying to do a sort() on something that isn’t indexed. And I’m gazing at the mongodb logs to figure out what. But while they tell me exactly what the query is, they don’t seem to tell me what the sort is at all. ): This is 2.6. Any thoughts on how to find out what sort is generating the error?
[18:47:51] <boutell> Looks like I’m stuck increasing the limit in the meantime, but I know that’s bad practice.
[18:48:44] <boutell> ooh, and since I’m on 2.6 I can’t. (:
[19:05:57] <cjhackerz> hello
[19:06:24] <cjhackerz> i need help in installing mongo db on rhel 7
[19:06:50] <cjhackerz> i followed all instructions from documentation page
[19:07:03] <cjhackerz> but still i am getting error
[19:08:18] <cjhackerz> check log file output here :- http://pastebin.com/jdEjwtdS
[19:09:10] <cjhackerz> please help asap
[19:10:53] <cheeser> looks like you lack the filesystem permissions to delete that .sock file
[19:11:51] <cjhackerz> hmm
[19:12:06] <cjhackerz> any idea to fix the issue?
[19:12:17] <cheeser> su to the correct user and try again
[19:13:17] <cjhackerz> i have only one user in my system that is me as admin account..
[19:13:46] <cheeser> that's not true.
[19:14:22] <cjhackerz> so mongo db has created user? while installation ?
[19:14:35] <StephenLynx> no
[19:14:37] <cheeser> linux, in general, has several users defined
[19:14:44] <StephenLynx> ah hold up
[19:14:50] <cheeser> who owns that file?
[19:14:53] <StephenLynx> you talking about the user that starts the process?
[19:14:54] <cheeser> ls -al <that file>
[19:15:27] <cjhackerz> ok let me check
[19:17:13] <cjhackerz> [root@dhcppc7 tmp]# ls -al mongodb-27017.sock
[19:17:13] <cjhackerz> srwx------. 1 root root 0 Jul 20 23:31 mongodb-27017.sock
[19:17:26] <cheeser> are you trying to start as root?
[19:17:27] <JustinHitla> is that channel about pokemongo ?
[19:17:33] <cheeser> no
[19:18:03] <cheeser> if this channel was about pokemon it would say so in the name
[19:18:08] <StephenLynx> kek
[19:18:17] <JustinHitla> cheeser: it partly says it in the name
[19:18:22] <StephenLynx> which channel?
[19:18:23] <cheeser> no it doesn't.
[19:18:32] <JustinHitla> "mongo"
[19:18:48] <cheeser> #mongodb
[19:18:49] <JustinHitla> it will say "pokemongo" if you poke it
[19:18:55] <cheeser> please move along
[19:18:59] <cjhackerz> i tried to start the service as root and also by using sudo; none of them worked
[19:18:59] <StephenLynx> kek
[19:19:37] <JustinHitla> you can't caught me
[19:20:02] <cjhackerz> should i change the file ownership to fix the problem?
[19:20:46] <cjhackerz> cheeser?
[19:21:17] <cheeser> start it as root and you should be fine.
[19:21:31] <cheeser> though why that .sock file remains is something you should figure out
[19:21:38] <cheeser> is there still a mongod running?
[19:22:30] <cjhackerz> [root@dhcppc7 tmp]# mongo
[19:22:30] <cjhackerz> MongoDB shell version: 3.2.8
[19:22:31] <cjhackerz> connecting to: test
[19:22:31] <cjhackerz> 2016-07-21T00:43:04.225+0530 W NETWORK [thread1] Failed to connect to 127.0.0.1:27017, reason: errno:111 Connection refused
[19:22:31] <cjhackerz> 2016-07-21T00:43:04.275+0530 E QUERY [thread1] Error: couldn't connect to server 127.0.0.1:27017, connection attempt failed :
[19:22:33] <cjhackerz> connect@src/mongo/shell/mongo.js:229:14
[19:22:35] <cjhackerz> @(connect):1:6
[19:22:38] <cjhackerz> exception: connect failed
[19:22:39] <cjhackerz> nope
[19:22:39] <cheeser> don't paste here
[19:22:41] <cjhackerz> not running
[19:22:43] <cjhackerz> ok
[19:23:10] <cheeser> ps axuw | grep mongod
[19:25:19] <cjhackerz> hmm
[19:26:09] <cjhackerz> ok i executed your command
[19:26:27] <cheeser> and?
[19:26:50] <cjhackerz> can i paste output here? or in your pm?
[19:26:57] <cheeser> gist.github.com
[19:28:27] <cjhackerz> https://gist.github.com/anonymous/447a132f3b5bf0be42562ad052899998
[19:28:52] <cheeser> ok. looks like you can delete that file and try to restart your service then
[19:29:34] <cjhackerz> ok cool going to delete it manually
[19:30:46] <cjhackerz> yaaay done
[19:31:09] <cjhackerz> thank you so much cheeser
[19:31:12] <cheeser> np
[19:31:16] <cheeser> happy hacking
[19:31:21] <cjhackerz> :p
[19:51:19] <AndrewYoung> 'ello
[19:51:46] <Derick> hi
[20:11:59] <xmad> Does $lookup always do a full scan or can it be limited by a $match operator?
[20:13:44] <cheeser> https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/
[20:14:02] <cheeser> if you're doing a $lookup on a field, you'll want that field indexed.
[20:18:21] <xmad> Cool, would a $match following a $lookup use an index, if available?
[20:19:40] <cheeser> this is not my area, but i would expect that the optimizer would attempt to move that $match before the $lookup. but if you're trying to $match on the values of the $lookup result, it couldn't do that of course.
[20:19:48] <cheeser> it also couldn't use an index, afaik.
[20:23:30] <xmad> Yeah, assuming the $match is operating on fields that come from $lookup
[20:24:31] <xmad> https://docs.mongodb.com/manual/reference/operator/aggregation/match/
[20:24:38] <xmad> "If you place a $match at the very beginning of a pipeline, the query can take advantage of indexes"
[20:24:55] <cheeser> it's entirely possible the optimizer is smart enough to detect that filter and apply it to the lookup but i doubt it.
[20:26:32] <JFlash> hi, mongodb is being grumpy again
[20:26:34] <JFlash> http://hastebin.com/momefemene.vhdl
[20:27:07] <xmad> Thanks so much. I think it's fairly safe to assume that $match doesn't use indexes if it's not at the beginning (given the wording from the doc). Maybe $match can be optimized for $lookup on later versions
[20:27:11] <xmad> $lookup is a fairly new feature after all
[20:28:02] <JFlash> i think the last time i got it working by manually passing the config file
[20:28:25] <JFlash> but now i get this error when I try passing -f
[20:28:28] <JFlash> F CONTROL Failed global initialization: FileNotOpen Failed to open "/var/log/mongodb/mongod.log"
[20:33:05] <cheeser> JFlash: permissions problem on that file. you're not starting mongod as the correct user.
[20:34:46] <JFlash> hmm
[20:34:52] <JFlash> im looking at the logs
[20:34:55] <JFlash> I got this line
[20:34:57] <JFlash> Failed to unlink socket file /tmp/mongodb-27017.sock errno:1 Operation not permitted
[20:35:02] <cheeser> permissions
[20:35:06] <JFlash> could this be the reason
[20:35:14] <JFlash> should I just sudo service start ...
[20:36:39] <cheeser> yeah. you generally have to be root to start services.
[20:37:28] <JFlash> thanks , it worked
[20:57:01] <sivli> Hi all, I have a quick question about aggregation and $out if anyone has a moment.
[21:01:50] <AndrewYoung> What's your question?
[21:02:11] <sivli> Just wondering what the best way to aggregate data to a constant collection w/o deleting existing docs.
[21:02:35] <cheeser> $out will overwrite the existing data.
[21:02:52] <sivli> I am aware, which is my problem.
[21:03:53] <sivli> I could just have it returned to my app server and then bulk insert but that seems a needless usage of network resources.
[21:04:42] <AndrewYoung> There's no easy way for mongo to know if it can cleanly insert all those records or not. If there are _id or other unique index constraints on the existing collection it would cause problems with the insert.
[21:05:04] <sivli> makes sense, I figured that was the reason.
[21:05:11] <AndrewYoung> That's just my guess. :)
[21:05:29] <sivli> yep :)
[21:06:50] <sivli> So what I am wondering is there a way to cleanly do a collection to collection copy or do I really need to send the results over the wire both ways?
[21:07:14] <sivli> The col to col option likely has the same issues as $out (assuming that is the reason for the limitation)
[21:07:27] <AndrewYoung> You could use server side javascript to do it.
[21:08:02] <AndrewYoung> But server side javascript requires a global lock.
[21:08:44] <AndrewYoung> It would probably be better to do it the "naive" way by implementing it in code on your app server.
[21:09:21] <sivli> True but my mongo host (Atlas) likely does not have a way to do in locally. So I would have to do it over the wire.
[21:10:05] <sivli> Which I can... but is feels like such a waste. Thus here I am asking for other options :)
[21:10:14] <AndrewYoung> Yeah, I hear ya.
[21:12:34] <sivli> Ah well, will stick around for a while in case anyone has other input. Thanks for listening AndrewYoung :)
[21:12:44] <AndrewYoung> No problem. :)
[21:13:13] <AndrewYoung> The only other thing I can think of is using mongodump/mongorestore, but that will still go over the network in your case.
[21:36:51] <warp0x00> why is my mongod using 15GiB of swap and how do I stop it from doing that
[21:39:04] <AndrewYoung> What version are you running?
[21:39:25] <warp0x00> db version v3.2.8
[21:42:53] <AndrewYoung> Running on Linux? Are you seeing the swap usage in top?
[21:43:44] <AndrewYoung> Or vmstat?
[21:43:55] <warp0x00> it shows up in top and in free
[21:47:08] <warp0x00> The documentation says:
[21:47:19] <warp0x00> Nevertheless, systems running MongoDB do not need swap for routine operation. Database files are memory-mapped and should constitute most of your MongoDB memory use. Therefore, it is unlikely that mongod will ever use any swap space in normal operation.
[21:48:00] <warp0x00> but clearly that's wrong, because linux doesn't swap memory-mapped files, only anonymous memory
[21:48:20] <AndrewYoung> That bit of the documentation is talking specifically about the MMAPv1 storage engine.
[21:49:12] <AndrewYoung> Which uses mmap to map files to virtual memory.
[21:49:22] <AndrewYoung> However, the default storage engine for the version you're using is WiredTiger.
[21:50:18] <warp0x00> Okay let me switch
[21:52:19] <AndrewYoung> How much RAM is in that box?
[21:52:30] <warp0x00> 32GiB
[21:52:59] <JFlash> say i have a document with a text property
[21:53:15] <JFlash> and a collection with many of these documents
[21:53:28] <JFlash> how do I find out the most used word
[21:53:50] <AndrewYoung> Changed in version 3.2: Starting in MongoDB 3.2, the WiredTiger internal cache, by default, will use the larger of either:
[21:53:50] <AndrewYoung> 60% of RAM minus 1 GB, or
[21:53:51] <AndrewYoung> 1 GB.
[21:53:51] <JFlash> in other words, does mongo offer any functionality to search inside the text?
[21:54:13] <AndrewYoung> JFlash: https://docs.mongodb.com/manual/reference/operator/query/text/
[21:54:21] <JFlash> oo
[21:54:34] <AndrewYoung> You'll have to set up a text index on the fields that you want to search.
[21:54:58] <JFlash> right...
[21:55:02] <AndrewYoung> You can also search text strings using a regex, but it'll be slower.
[21:55:04] <JFlash> so that will not do it
[21:55:15] <JFlash> I have to tokenize the text then
[21:55:25] <JFlash> so maybe i need to use map reduce?
[21:56:24] <JFlash> what if I saved the text as tokens prior to insertion
[21:56:36] <AndrewYoung> warp0x00: You might just need to adjust the WiredTiger cache size, but that's just a guess. https://docs.mongodb.com/manual/faq/storage/#wiredtiger-storage-engine
[21:56:37] <JFlash> that would be sweet
[21:56:55] <JFlash> so now each tweet would have an array called tokens
[21:57:10] <JFlash> so now I guess there should be a way to do it using agregate
[21:57:10] <AndrewYoung> Yeah, if you put an array of tokens into the document instead of a pure text field, that would make your problem easier.
[21:57:53] <warp0x00> AndrewYoung, that fixed it. However, the documentation is out of date esp. re: "why is mongo using so much memory"
[21:58:12] <JFlash> how can I use aggregate to search for the most common items in a given array... do I have to unwind?
[21:58:29] <AndrewYoung> warp0x00: The cache size change fixed it?
[21:58:34] <warp0x00> Yes.
[21:58:41] <AndrewYoung> Ah, cool.
[21:59:31] <AndrewYoung> JFlash: Yeah, I would probably $unwind and then use $sum.
[22:00:33] <JFlash> AndrewYoung, but also groupBy token, right?
[22:00:55] <JFlash> what do I groupby for?
[22:01:31] <AndrewYoung> warp0x00: Is the wording here useful at all? (I'm asking about the documentation quality, specifically) https://docs.mongodb.com/manual/administration/production-notes/?_ga=1.27790261.866869998.1468255429#hardware-considerations
[22:01:40] <AndrewYoung> Look at the bit in green. (The note)
[22:02:13] <AndrewYoung> JFlash: I would do a compound of the _id and the token field, probably.
[22:02:31] <JFlash> hmm
[22:02:38] <AndrewYoung> That would give you a count of how many times that token shows up in the list for that document.
[22:02:39] <JFlash> thanks
[22:02:48] <JFlash> okey
[22:02:56] <AndrewYoung> No problem. I'm still relatively new to MongoDB, so take what I say with a grain of salt. ;)
[22:03:37] <JFlash> yeah.. see I don't need a number of x words per document, I need a general number across all documents
[22:03:48] <AndrewYoung> Then just group by the token and leave the _id out.
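The pipeline AndrewYoung sketches ($unwind the tokens array, $group on the token with $sum: 1, then sort by count) can be modeled in memory like this, using hypothetical tweet documents with a `tokens` array:

```javascript
// Hypothetical documents, each with a pre-tokenized text field:
const tweets = [
  { _id: 1, tokens: ['mongo', 'db', 'mongo'] },
  { _id: 2, tokens: ['db', 'mongo'] },
];

// Equivalent of $unwind + $group-by-token with $sum: 1, sorted descending.
function mostCommonTokens(docs) {
  const counts = new Map();
  for (const d of docs)
    for (const t of d.tokens) counts.set(t, (counts.get(t) || 0) + 1);
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}
```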
[22:03:53] <warp0x00> AndrewYoung, yeah that's good. But, all of the answers on stackoverflow seem to predate WiredTiger and 3.2
[22:04:09] <warp0x00> https://www.google.ca/search?q=why+is+mongodb+using+so+much+memory
[22:04:11] <JFlash> sure, I will post the code here if it doest work
[22:04:14] <AndrewYoung> warp0x00: Ah, crappy.
[22:04:37] <warp0x00> top 6 results for me
[22:05:17] <AndrewYoung> Yeah, those are all old links.
[22:05:48] <AndrewYoung> You might set the search time to "past year".
[22:06:33] <AndrewYoung> Glad you got it working though. :)
[22:08:58] <warp0x00> if someone made a new SO/Severfault question it would likely just get marked as duplicate. I wonder if someone with lots of stack overflow points can go fix them or if there is even a mechanism of fixing answers that are wrong because they're out of date on SO
[22:09:13] <AndrewYoung> That's a great question.
[22:09:20] <AndrewYoung> It must be a common problem.