pmxbot IRC Log Viewer

[01:18:00] <bmw0679> I've run into a write lock problem because of lack of disk space. This prevents me from running db.dropDatabase() which would solve my disk space problem. If I remove the database files manually from the file system would this cause a problem?

[01:55:21] <dvpalex_> Hi

[02:13:13] <Semor> Is mangodb a ram db?

[02:25:16] <ranman> mongodb?

[02:26:58] <Semor> yes

[02:27:03] <Semor> mongodb

[02:28:59] <darius93> ram database Semor? I doubt it

[02:29:22] <darius93> if you want to store the db in a ramdisk you can easily but you must make backups to prevent data lost

[02:29:45] <Semor> In my application ,I save my data by executing SQL sentences , is mongodb advanced than this ?

[02:31:18] <Semor> I save my data in ram by c++ struct ,then in another db thread execute sql transections to do persistence

[02:33:26] <darius93> no. Its more of a json-like format but almost has similar functions.

[02:33:27] <darius93> http://docs.mongodb.org/manual/reference/sql-comparison/

[02:33:53] <darius93> i wouldnt recommend storing the data in ram for a long period of time

[02:34:12] <darius93> possibly dump it to disk and clean it out to of memory and call on it when needed

[02:34:29] <Semor> darius93: my db thread save for 1 second interval

[02:35:02] <darius93> hmm

[02:35:19] <Semor> darius93:if not use a ram cache ,my application may be some slower

[02:35:20] <darius93> well you can go on and try mongo out if you want. check out that link to see the comparison between the two

[02:35:37] <darius93> you can store the data in memory and save it to the db

[02:35:49] <darius93> but i wouldnt recommend saving it every second unless your server is that fast

[02:35:58] <darius93> and a native application being slow?

[02:36:29] <Semor> darius93:yes ,but if I use mongodb ,how is the operation procedure?

[02:37:02] <Semor> darius93:I do not know whether mongodb implement db cache in ram

[02:37:44] <Semor> And one mongodb sql operation whether directly operate on the db disk ?

[02:41:53] <darius93> well i havent used mongo in C/C++ yet but overall e.g INSERT INTO EXAMPLE ('section') values ('data'); would be like doing db.example.insert({section:'data'});

[02:44:22] <darius93> Semor: if you dont mind me asking, what type of app are you building?

[03:03:36] <Semor> darius93:I am building a network server ,receiving client's operations and calculating results , then saving to db

[03:04:08] <darius93> tcp server?

[03:04:56] <Semor> darius93:Yes , And my server has its matching client

[03:05:59] <Semor> they communicate with each other with tcp protocol

[03:06:07] <darius93> cool

[03:06:48] <Semor> I am using mysql now

[03:07:31] <darius93> Ah ok. I would suggest MariaDB over MySQL but if you need too, use mongodb

[03:07:42] <darius93> all depends on your needs

[03:07:50] <Semor> MariaDB?

[03:08:08] <darius93> Yep. Its basically a fork of MySQL

[03:08:22] <darius93> but have different things that mysql doesnt

[03:08:38] <darius93> it works out of the box and is compat with mysql dbs

[03:09:07] <Semor> Do I need to write sql sentences to save data?

[03:10:34] <darius93> you can leave the statements you currently have

[03:11:20] <Semor> darius93:why do not you recommend mongodb ?

[03:11:36] <darius93> I would recommend it

[03:11:46] <darius93> but it all depends on your needs

[03:12:05] <darius93> mongodb is a document db, while mysql/mariadb is a rational db

[03:12:42] <darius93> NoSQL is scalable and MariaDB can be scalable too

[03:12:57] <darius93> but it depends on how you use and take advantage of it

[03:13:20] <Semor> darius93:my requirement is faster running and easy to use

[03:14:58] <darius93> Any database can be fast, its just how you use it to make it perform better. I think MongoDB may be worth a try then, or MariaDB

[03:16:34] <Semor> darius93: I want to refer to some c++ using mongodb examples

[03:17:00] <darius93> http://docs.mongodb.org/ecosystem/tutorial/getting-started-with-cpp-driver/

[03:17:05] <darius93> that should help you out Semor

[03:19:14] <Semor> darius93: thanks!

[04:50:11] <SlexAxton> hi, if I have a sparse index on two properties, and one is nil, and the other exists, does it still get indexed?

[04:50:54] <SlexAxton> (sparse, unique index)

[04:51:41] <SlexAxton> and if it does get picked up, does nil count towards the uniqueness?

[04:52:03] <SlexAxton> [nil, x] can only occur once

[04:52:13] <SlexAxton> thanks!

[04:58:02] <SlexAxton> i think this might be related

[04:58:02] <SlexAxton> https://jira.mongodb.org/browse/SERVER-785

[04:58:04] <SlexAxton> and is open

[04:58:10] <SlexAxton> so probably not in stable

[06:42:53] <b0ss_> so mongodb does not implement a many-to-many relationship in something similar to the JOIN TABLES, right? It just place the other object's ID onto the required object ?

[06:46:41] <b0ss_> However, Managing these relationships is the difficult part. Say, if you delete one side you have to manually update the other side. Right ?

[07:39:52] <omid8bimo> i need some guidance on something. i have 3 mongodb servers in my replica set, 1 master mongodb, 1 slave mongodb and one as arbiter; my slave server crashed yerterday and after lots of file system repair, server become ok but mongodb couldnt start, so i removed the /var/lib/mongodb, created an empty directory for it and started the service to get the data back from master, but only half of the data got

[07:39:53] <omid8bimo> replicated. is that normal?

[08:42:22] <Gargoyle> omid8bimo: Nope. Sound like the server is stull fubar.

[08:48:07] <omid8bimo> Gargoyle: could you explain more?

[08:48:40] <omid8bimo> first of all, how can i know how much data has been saved in oplog? so that i can understand when i will start overwriting

[08:49:16] <omid8bimo> second of all, why mongodb would not replicate the whole dataset when a fresh mongo instance joined the replicaset?

[08:55:49] <Gargoyle> omid8bimo: I can't think of any reason why under a normal configuration the whole data would not be sync'd. Are there any errors in the logs?

[08:56:10] <Gargoyle> Dunno about checking the oplog size, but don't think that would make any difference ether.

[08:57:20] <omid8bimo> Gargoyle: no errors. db.printSlaveReplicationInfo() on master said secondary is 20 hours behind

[08:57:39] <Gargoyle> Is it still syncing?

[08:57:43] <omid8bimo> and db.printReplicationInfo() says:

[08:57:48] <omid8bimo> configured oplog size: 102400MB

[08:57:50] <omid8bimo> log length start to end: 141318secs (39.26hrs)

[08:58:14] <omid8bimo> does this mean my oplog for keep around 40 hours of data?

[08:59:34] <omid8bimo> Gargoyle: master db db.printSlaveReplicationInfo() says syncedTo: 1 seconds ago, but data on secondary that came up like 1 hour ago is still 30 GB behind master!

[09:04:41] <Gargoyle> What does printSlaveReplication() on the slave say?

[09:07:24] <omid8bimo> says its not a function!

[09:07:34] <omid8bimo> do you mean printSlaveReplicationInfo()?

[09:07:41] <Gargoyle> yeah

[09:08:15] <omid8bimo> p_set:SECONDARY> db.printSlaveReplicationInfo()

[09:08:16] <omid8bimo> source: kookoja-web:27017

[09:08:18] <omid8bimo> no replication info, yet. State: ARBITER

[09:08:20] <omid8bimo> source: kookoja-db01r1s0:27017

[09:08:22] <omid8bimo> syncedTo: Fri Jan 24 2014 12:35:30 GMT+0330 (IRST)

[09:08:24] <omid8bimo> = 0 secs ago (0hrs)

[09:10:36] <kali> omid8bimo: what makes you think only half the data got there ?

[09:10:58] <kali> omid8bimo: have you checked the database from inside, or just look at the file size ?

[09:12:26] <omid8bimo> becase of two factors, one is that total size of /var/lib/mongodb is different on both servers. (secondary is aroudn 20GB less) and also, based on my collections, i have kcollection.48 on master but only kcollection.43 on secondary

[09:13:27] <Gargoyle> omid8bimo: Don't think you can count disk space size. Your master will probably have "holes" in it's data files, your secondary will be nice and neat to start with.

[09:14:16] <omid8bimo> Gargoyle: so to be sure that my data on secondary is up to date, can i do this?

[09:14:23] <omid8bimo> remove the /var/lib/mongodb

[09:14:35] <omid8bimo> create an empty folder

[09:15:16] <omid8bimo> start mongod on secondary with replicaSet config and wait until it sync the entire /var/lib/mongodb on master which is 180 GB

[09:15:44] <kali> omid8bimo: yes. just don't expect the size on disk to be similar, it has no reason to be

[09:16:19] <omid8bimo> kali: ok but collection names or the quantity must be the same, correct?

[09:16:43] <kali> omid8bimo: yes. except stuff from the local db

[09:17:04] <omid8bimo> like if i have up to kcollection.48 on master, i must have kcollection.48 on secondary as well?

[09:17:11] <kali> yeah

[09:17:21] <kali> ha !

[09:17:24] <kali> no !

[09:17:29] <omid8bimo> ?

[09:17:30] <kali> that's a file name

[09:17:37] <kali> right ?

[09:17:53] <omid8bimo> yup

[09:18:06] <kali> "kcollection" is a database name, it needs 48 data "segments" on your primary

[09:18:28] <kali> but it will very likely require less when synced on your secondary

[09:18:37] <Gargoyle> omid8bimo: Perhaps you should stop looking at the files and use the shell to check things and leave mongo to "do its thing" with your files!

[09:18:40] <kali> same problem than size on disk

[09:18:47] <kali> Gargoyle: +1

[09:21:16] <omid8bimo> all right. so tell me a cmd so i can use to compare data on both

[09:21:16] <omid8bimo> something like "show dbs" is good enough?

[09:21:30] <Gargoyle> rs.status() ?

[09:21:56] <kali> rs.status() should show the secondary in state secondary (not recovering)

[09:22:32] <kali> you can also look at the rsSync lines in the secondary logs

[09:23:17] <kali> but bottom line is, the procedure you followed is very really robust (as long as the oplog is deep enough, and it sounds like it is)

[09:23:43] <kali> stop messing with your server, it's fine :)

[09:23:55] <omid8bimo> kali: ok, here is my rs.status() output on secondary http://paste.debian.net/78117/

[09:24:16] <kali> yep, purring like a kitten

[09:24:35] <kali> leave it alone

[09:24:49] <omid8bimo> kali: ok :) so its up to date and working well enough

[09:25:41] <omid8bimo> one more thing, how much approximately does it take to do a initial sync on a LAN connection for 180 GB data?

[09:26:00] <omid8bimo> if ever i wanted to do a initial sync?

[09:26:27] <Gargoyle> omid8bimo: define LAN ?

[09:27:37] <kali> it depends on about a dozen factor...

[09:27:40] <omid8bimo> Gargoyle: normal 1000GB connection

[09:27:55] <omid8bimo> oops! i meat 1GB/ps

[09:27:58] <Gargoyle> omid8bimo: 100GB is not normal! :P

[09:27:59] <omid8bimo> *meant

[09:28:45] <Gargoyle> omid8bimo: Data throughput on your ethernet will be about 75% of it's speed.

[09:29:28] <omid8bimo> so im guessing several days to do a initial sync of 180GB data from master to a new secondary

[09:31:29] <Gargoyle> So. do 180GB * 8 = 1440Gbits/sec * 1024 = 1,474,560Mbits / 750mbps = 1,966 seconds = about 30 mins.

[09:31:58] <kali> more in the ballpark of one hour, yeah

[09:33:07] <omid8bimo> Gargoyle: just 30 mins?

[09:33:42] <Nodex> max throughput would be 30 mins

[09:33:50] <Nodex> in reality it's closer to an hour

[09:35:22] <Gargoyle> Other factors will make a difference, disk speed, other network traffic, etc. But theiretically, your two machines should be about to shove 180GB down the wire in about 30 mins (theoretical max).

[09:36:41] <omid8bimo> ok i wanna do something for test. i just stopped secondary, renamed /var/lib/mongodb, created an empty folder with permissions set, started and now it start syncing data from master

[09:36:56] <omid8bimo> but i dont see any recovering state in rs.status()

[09:37:24] <kali> it might be "initial sync" or something like that

[09:38:39] <omid8bimo> even i tried grep on mongodb.log for ip address of master, and im getting liek 10 connections per 5 mins

[09:38:48] <balboah> also building indexes will pause replication

[09:39:05] <Nodex> indexes get built after the sync

[09:39:06] <balboah> so not all about bandwidth

[09:39:14] <balboah> well before it becomes ready at least :)

[09:39:54] <Nodex> if the replication has high traffic there will be a lock while indexes are built stopping new data arriving which will resync once the index has built

[09:41:40] <omid8bimo> ok fair enough. i have MMS service configured on my servers. which graph on secondary host i should pey attention for the progress of replication/sync?

[10:29:09] <tiller> Hi!

[10:29:59] <tiller> Is there a way to check it -a field- has a specific value? A field being: "It doesn't matter which one".

[10:31:05] <tiller> For example, if I have: {_id: 1, subDocumentToTest: [ {a: "test"}, {b: "thing"}, {a: "hello", c: "test"} ] }

[10:31:29] <tiller> I want to have a query like : { "subDocumentToTest.?": "test" }

[10:35:47] <Nodex> no

[10:36:30] <tiller> Ok. I guess I'll just do a -big- OR

[10:36:50] <tiller> Thanks

[12:15:23] <byalaga> How to know when the oplog rotation happens on a primary? any help ?

[12:19:26] <kali> byalaga: what ?

[12:21:19] <byalaga> kali: Is there any way to estimate when a oplog is rotated ?

[12:22:00] <kali> byalaga: what do you mean "rotated" ?

[12:22:29] <kali> oplog is sliding, not rotating

[12:23:43] <byalaga> I mean, we need to take backup from a secondary node and we want to ensure the oplog on the primary node is not rotated/slided?

[12:23:44] <byalaga> so that whenever we bring this secondary node.. it should catch up with the primary

[12:25:02] <kali> byalaga: db.printReplicationInfo() on the primary will tell you what are the first and last event of your oplog

[12:25:20] <kali> byalaga: and the length of the time span covered

[12:26:24] <gnagno> hello all

[12:26:34] <kali> byalaga: apart from that i don't understand what you mean

[12:26:51] <gnagno> is it possible to create a replicaset using diffent mongodb versions? say I have a 1.8.1 and 2.4.x ?

[12:27:59] <byalaga> kali: http://pastie.org/8663455

[12:28:23] <byalaga> does that mean my oplog on this server can hold 62hrs of oplog?

[12:28:24] <kali> gnagno: heteogeneous replica sets are meant for version migration only. they will work for some combination but it is not recommended, and not supported as far as i know

[12:28:48] <kali> gnagno: with such a wide range (1.8 to 2.4) i would not even think about it

[12:29:03] <kali> byalaga: yes

[12:30:29] <byalaga> how can we say that?? the host may have 5operation in 1 sec sometimes.. and 50operations sometimes..

[12:30:53] <gnagno> kali, thank you, actually version migration is what I need, so you suggest me to migrate first from 1.8 to 2.0 then from 2.0 to 2.4 ?

[12:31:55] <kali> gnagno: look at the release notes for the release branch (2.0, 2.2, and 2.4). they will tell you for sure which one can be skipped. Don't forget 2.2

[12:32:26] <byalaga> kali: how can we say that exactly?? the host may have 5operation in 1 sec sometimes.. and 50operations sometimes..

[12:32:26] <byalaga> if the latest operation on my secondary may/may not be present on the primary

[12:34:47] <kali> byalaga: it's just based on what happened in the past. it holds 62 hours of backlog, but if your load change, it will obviously cover less time

[12:36:43] <byalaga> kali: my oplog size on all nodes is 2048MB. for how log can i keep my secondary node down for backups? so that.. if i bring it back it should be able to catch up with primary?

[12:38:32] <kali> byalaga: well, assuming your load does not change suddenly, about 62 hours

[12:39:57] <byalaga> "oplog first event time" on primary is not changing.. does that mean.. oplog is not sliding now?

[12:41:45] <kali> that's a bit weird... unless it is actually the beginning of the existence of the primary and every write since is fitting in the oplog

[12:42:24] <byalaga> exactly, so it is not yet filled and not sliding?

[12:42:37] <kali> sounds like it

[12:42:57] <byalaga> is there a way to check when it will start sliding?

[12:44:48] <_pash> hello, how long would it take to find a match of 12000 entries in a db and whats the best way to do that?

[12:45:06] <kali> byalaga: mmmm maybe try: use local; db.oplog.rs.stats();

[12:46:14] <byalaga> "count" : 5361108,

[12:46:14] <byalaga> "size" : 2061707724,

[12:47:13] <byalaga> "storageSize" : 2147487728,

[12:47:59] <byalaga> kali: any idea? how to calculate it from the above results?

[12:48:08] <kali> count is the count of documents, size and storageSize are quite close... so i'm not sure

[12:48:29] <gnagno> quick question: assume that I have a mongo server with a lot of data in it already running, if I create a replica set will it automatically migrate all the data also?

[12:48:49] <byalaga> ok thanks for you time

[12:48:55] <byalaga> kali: ^^

[12:49:04] <kali> gnagno: you want to migrate from standalone to replicated ?

[12:49:43] <gnagno> kali, yes

[12:49:58] <kali> gnagno: it's a relatively painless and well documented procedure: http://docs.mongodb.org/manual/tutorial/convert-standalone-to-replica-set/

[12:53:33] <_pash> how long would it take to run a search through 12000 entries and mongo and whats the best way of finding a match?

[12:57:26] <gnagno> thank you very much kali

[13:00:39] <Nodex> _pash : that's relative

[13:00:56] <Nodex> normally about the same length as a piece of string

[13:01:57] <_pash> Nodex: it would literally be like 12000 entries of usernames that still have not registered, and as soon as the user types that username, i need to check whether it is or not registered

[13:03:25] <kali> less than a tenth of milliseconds if you have an index on the username and it fits in RAM

[13:03:25] <Nodex> sometimes it can be shorter than the string but sometimes it's the same

[13:04:21] <_pash> kali: how do i make sure that its in ram?

[13:04:32] <_pash> what do i read up on? kali

[13:04:34] <kali> you buy enough of the stuff

[13:09:36] <_pash> kali: ensureIndex() ?

[13:11:28] <kali> come on, have a look at the doc

[13:11:40] <_pash> which part

[13:11:41] <kali> (yes. ensureIndex)

[13:11:49] <_pash> ok thanks

[15:42:20] <trueneu> Hi. Is it possible that PRIMARY and SECONDARY servers in the same replica set have different collections in the same db?

[15:42:52] <trueneu> No replication lag whatsoever. I did an initial sync for the SECONDARY server and it doesn't seem to be synced.

[15:47:42] <joannac> Unlikely, but anything's possible

[15:48:35] <cheeser> that'd be a pretty colossal bug...

[15:49:54] <joannac> I would say more likely you hit some problem, possibly corruption, or maybe you're not really up to date

[15:50:42] <trueneu> Right now it's 2.0.something version, probably there _was_ such a bug?

[15:51:39] <trueneu> I actually have 3 servers in this repl set, PRIMARY and one SECONDARY are 2.0.smth, and the second SECONDARY is 2.4.smth. Both secondaries have the same data sets

[15:52:00] <trueneu> That differ from the set that is on PRIMARY

[15:53:03] <trueneu> And rs.status() doesn't show anything abnormal

[16:01:08] <tiller> Hey, I've some difficulties with an upsert, can someone have a look? :/

[16:01:11] <tiller> http://pastebin.com/qHhECkjP

[16:13:51] <joannac> tiller: could you do it with elemMatch and the positional operator?

[16:15:21] <tiller> joannac> You mean within my $pull? If so, I've got the same error

[16:15:48] <joannac> No, without using a pull

[16:15:55] <joannac> just an update with elemMatch

[16:16:13] <tiller> ah

[16:16:17] <tiller> Lemme try :)

[16:17:31] <joannac> trueneu: erm, both secondaries have the same data and both were synced from scratch from the primary?

[16:17:47] <trueneu> Not sure about the second secondary

[16:17:52] <joannac> trueneu: secondary has a strict subset of the primary's data?

[16:18:25] <trueneu> I've checked the other secondary, the one I did initial sync for -- it synced from the second secondary :)

[16:18:39] <trueneu> joannac, yes, a strict one.

[16:19:12] <tiller> joannac> I think it won't work. Because if a subscription already exists for the (user, entity) but without the new tag I want to insert, the search query will fail, and I'll upsert a whole new thing. Don't I?

[16:19:23] <trueneu> I still do not understand why the replication works, but I guess I will just bring down both secondaries and do 2 initial syncs.

[16:21:58] <trueneu> Thanks for your help.

[17:12:39] <harrymoore> hi, all. can someone tell me if there is way to use $pull to remove a single matching array element? it looks like it pulls all matching elements

[17:22:42] <Joeskyyy> harrymoore: Could probably use the $ position operator, assuming you're wanting to remove the first occurence

[17:24:06] <harrymoore> @joeskyy: yes i want to remove the first (or any really) matching value

[17:29:18] <Joeskyyy> even with that though, it'd be setting it to a new value :\ there's no way to pop it out from the looks of it

[17:29:22] <Joeskyyy> http://docs.mongodb.org/manual/reference/operator/update/positional/

[17:29:32] <Joeskyyy> But that's what I was referencing, in case you're interested.

[17:30:05] <Joeskyyy> Otherwise you may want to have an array that each element has something like an id of sorts, and the content.

[17:30:15] <Joeskyyy> That way you can use pull on the id field

[17:30:48] <Joeskyyy> i.e. if it were an array of comments, each new comment has the comment field, and the id field (incremented by one)

[17:35:06] <harrymoore> thx @joeskyyy! i will just have to handle it in app code (java) instead: read the doc, remove the element in question and $set the entire array

[17:36:28] <Joeskyyy> no prob

[17:44:44] <NyB> Hi, does anyone have any tips for speeding up the Java driver? It seems that it has become the bottleneck in my application...

[17:48:28] <NyB> it just doesn't seem to be able to feed my application fast enough :-/

[17:49:16] <Joeskyyy> maybe harrymoore has something on that :D he was just talking about the java driver

[17:50:27] <NyB> Joeskyyy: :-)

[17:52:32] <NyB> I am thinking about using two or more driver threads in parallel, but I was hoping to avoid the complexity involved if there is a more standard way.

[17:53:23] <harrymoore> @Ny8: you can distribute your read requests if you have a replica set. the java driver is smart enough to read from the fastest responding node

[17:53:39] <harrymoore> you may have to set slave ok or something

[17:53:40] <NyB> as it is, my DB input thread seems to be spending 70% of its time in BasicBSONDecode.decode() :-/

[17:54:37] <NyB> harrymoore: the server is not the problem - the limiting factor seems to be the fact that I am reading a collection from a single thread

[17:56:33] <NyB> I believe I cannot use the result of find() safely from parallel threads, right? I could not find any relevant synchronization calls in the source coe...

[17:56:34] <harrymoore> i haven't done any multi-threaded work. assume the drive is ok with it though

[17:57:27] <kali> NyB: i would rather run several find() in parallel than sharing a cursor

[17:57:42] <kali> NyB: can you shard you find() somehow ? some kind of natural pagination ?

[17:58:27] <kali> i assume you're fetching a significant amount of data from a collection, right ?

[17:59:05] <NyB> kali: about 1,5 million documents sorted by a counter field.

[17:59:40] <NyB> kali: for my tests that is, the real thing will fetch quite a bit more

[18:00:07] <NyB> each document is a single object with ~40 fields

[18:00:16] <kali> you need all the fields ?

[18:00:42] <NyB> kali: unfortunately yes...

[18:01:09] <kali> there is one thing to be aware of, also. if your database is accessed by real time stuff (like a web app) at some point you'll manage to move the bottleneck to mongodb, and you web app will die

[18:01:51] <kali> sounds tricky

[18:01:57] <kali> you need the documents in order ?

[18:02:24] <NyB> kali: yes, the order is important

[18:02:55] <kali> then you'll need to reorder them once decoded before pushing them through the main thing

[18:04:00] <NyB> kali: hmm... are you suggesting fetching them out-of order from several threads and then sorting them?

[18:06:05] <kali> well whatever you do, if you decode them in parallel, you'll get them in disorder, at least locally

[18:08:43] <NyB> kali: I guess part of my problem is that the bottleneck does not lie on the MongoDB server - if it did I would have a couple of ways to handle it... as it is mongod consumes about 30% of a single CPU core and the I/O is nowhere near saturation yet...

[18:09:26] <NyB> kali: it seems that I will have to split the collection somehow...

[18:10:15] <NyB> kali: thanks for the tips (and for being a sounding board) :-)

[18:17:47] <NyB> kali: is there a way to have the server add an "order" field automatically after sorting? something that I could use with $mod to split the output?

[18:23:24] <kali> NyB: i don't think so

[18:27:42] <NyB> kali: yeah, I did not really believe that such a thing would exist...

[19:28:04] <ekristen> looking for some advice for hosting mongodb in aws, I’ve got in development 400-500k documents right now, looking to be in the 10s of millions in a few months

[19:28:38] <ekristen> looking for advice on aws instance sizes to start out on

[19:31:52] <harrymoore> @NyB if you're still there here is a gist showing threaded access to the driver. I don't write threaded code so take it for what it's worth: https://gist.github.com/harrymoore/8604212#file-app-java you could add a synchronized data structure for defining which thread will process which records

[19:32:33] <cheeser> what's all that for?

[19:50:43] <ekristen> quiet in here

[19:55:54] <cheeser> we're all busy refreshing gmail trying to get it load

[20:00:39] <whaley> I missed something... what's wrong with just setting connectionsPerHost on MongoClient?

[20:01:18] <whaley> NyB: ^

[20:46:04] <module000> ekristen: late reply...but i host in AWS with mongo. you need to look at the size of your data sets, and pick accordingly. for example, i have a 60GB data set, so i host it on a m2.4xlarge, which has 68GB of memory. if performance is less important, scale down your instance to include less RAM

[20:46:59] <ekristen> module000: my data set is currenly only like 6gb on disk, thats roughly 1GB per 100k documents we are storing at the moment

[20:47:16] <ekristen> module000: sounds like you aren’t using a replicaset either?

[20:47:45] <module000> ekristen: it's sharded on a m1.large, but that shard only exists to mitigate an outage if the primary shard fails

[20:48:21] <module000> ekristen: if you only have 6gb,, you could use a m3.large (7.5GB memory), and make a much smaller instance to shard it with, so if you have an outage your tiny node keeps you from having an outage (albeit you move slower)

[20:50:32] <ekristen> maybe I misunderstand the term shard, as what you are saying doesn’t line up with my understanding of replicasets and sharding

[20:51:06] <module000> ekristen: i'm not using replica sets, i'm using a sharded cluster deployment

[20:51:28] <module000> ekristen: you could do the same with replica sets though, just elect a smaller instance to serve as your secondary

[20:52:45] <ekristen> ah ok

[21:06:00] <luimemee> hello

[21:06:45] <ekristen> so module000 how are you running 3 config servers? and mongos?

[21:06:55] <luimemee> if have a collections and each elements have a tags list. how can i have a list of all the tags on all the element please ?

[21:09:02] <kali> luimemee: look for "distinct" in the documentation

[21:09:43] <luimemee> kali, thanks !

[21:11:38] <luimemee> kali, this is magic !

[21:13:23] <cheeser> https://www.youtube.com/watch?v=e7mmrF-4rUE

[21:31:58] <beardage> hi, how can I tell mongodb in python not to index my data im importing. Thing is that it's OOM while importing millions of records.

[21:46:17] <luimemee> this should work ? db.runCommand ( { distinct: "mycollection", key: "tags", query: {'tags':{$regex:'*toto*'}}} )

[22:27:22] <the-erm> I need some advice, is there a better way to do this? http://pastie.org/8664985 I'm trying to store the hours of business so it can be looked up, but the challenge is hours of operation are from 12pm-2am the next day.

[22:31:16] <the-erm> I was thinking I could possibly store the values 'Monday': [ [0,1], [1200,2359] ] but I'm not sure how to write a query for that.

[23:16:49] <unholycrab> anyone use rockmongo? i want to know how to remove the "drop database" link

[23:41:40] <JeffC_NN> unholycrab: If you're comfortable editing the PHP, edit app/controllers/db.php and remove the links/capabilities

[23:43:57] <JeffC_NN> (for very easy edits, just search the file for "drop" and add a simple "return;" at the beginning of functions named things like doDropDatabase() (line 488) and doDropDbCollections() (line 451) or whatever you want to disable. Pretty easy.

[23:47:32] <unholycrab> thanks JeffC_NN

Log file Viewer

Help | Karma | Search:

#mongodb logs for Friday the 24th of January, 2014